An air quality index (AQI) is used by government agencies to communicate to the public how polluted the air currently is or how polluted it is forecast to become. Public health risks increase as the AQI rises.
There are six AQI categories, namely Good, Satisfactory, Moderately polluted, Poor, Very Poor, and Severe. The proposed AQI will consider eight pollutants (PM10, PM2.5, NO2, SO2, CO, O3, NH3, and Pb) for which short-term (up to 24-hourly averaging period) National Ambient Air Quality Standards are prescribed.
Based on the measured ambient concentrations, corresponding standards and likely health impact, a sub-index is calculated for each of these pollutants. The worst sub-index reflects overall AQI. Likely health impacts for different AQI categories and pollutants have also been suggested, with primary inputs from the medical experts in the group.
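The "worst sub-index" rule can be sketched in a few lines. This is only an illustration: the function names, the bucket boundaries (taken here as 50/100/200/300/400/500) and the sample sub-index values are our assumptions, and the sub-indices themselves are treated as already computed. The label "Moderate" follows the spelling used in the dataset's bucket column.

```python
# Upper bound of each AQI category (assumed boundaries, for illustration only).
AQI_BUCKETS = [
    (50, "Good"),
    (100, "Satisfactory"),
    (200, "Moderate"),
    (300, "Poor"),
    (400, "Very Poor"),
    (500, "Severe"),
]


def overall_aqi(sub_indices):
    """Return the worst (largest) sub-index, skipping pollutants with no value."""
    available = [value for value in sub_indices.values() if value is not None]
    return max(available) if available else None


def aqi_bucket(aqi):
    """Map an AQI value to its category label."""
    if aqi is None:
        return None
    for upper_bound, label in AQI_BUCKETS:
        if aqi <= upper_bound:
            return label
    return "Severe"  # anything above the last bound stays in the top category


# Hypothetical sub-indices for one station-day.
sample = {"PM2.5": 184, "NO2": 26, "O3": None}
aqi = overall_aqi(sample)
bucket = aqi_bucket(aqi)
```

With these made-up inputs the PM2.5 sub-index dominates, so the overall AQI is 184 and the category is "Moderate".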
In this project we focused mainly on cleaning the data and on drawing conclusions and visualisations from it, using libraries such as pandas and missingno.
We visualised the yearly data for every pollutant and identified the most polluted and the least polluted locations, both station-wise and city-wise.
Finally, we carried out a hypothesis test comparing air quality before and after COVID-19.
Data collected from: https://www.kaggle.com/rohanrao/air-quality-data-in-india
| | StationId | StationName | City | State | Status |
|---|---|---|---|---|---|
| 0 | AP001 | Secretariat, Amaravati - APPCB | Amaravati | Andhra Pradesh | Active |
| 1 | AP002 | Anand Kala Kshetram, Rajamahendravaram - APPCB | Rajamahendravaram | Andhra Pradesh | NaN |
| 2 | AP003 | Tirumala, Tirupati - APPCB | Tirupati | Andhra Pradesh | NaN |
| 3 | AP004 | PWD Grounds, Vijayawada - APPCB | Vijayawada | Andhra Pradesh | NaN |
| 4 | AP005 | GVM Corporation, Visakhapatnam - APPCB | Visakhapatnam | Andhra Pradesh | Active |
| ... | ... | ... | ... | ... | ... |
| 225 | WB010 | Jadavpur, Kolkata - WBPCB | Kolkata | West Bengal | Active |
| 226 | WB011 | Rabindra Bharati University, Kolkata - WBPCB | Kolkata | West Bengal | Active |
| 227 | WB012 | Rabindra Sarobar, Kolkata - WBPCB | Kolkata | West Bengal | Active |
| 228 | WB013 | Victoria, Kolkata - WBPCB | Kolkata | West Bengal | Active |
| 229 | WB014 | Ward-32 Bapupara, Siliguri - WBPCB | Siliguri | West Bengal | NaN |
230 rows × 5 columns
| | StationId | StationName | City | State | Status |
|---|---|---|---|---|---|
| 0 | AP001 | Secretariat, Amaravati - APPCB | Amaravati | Andhra Pradesh | Active |
| 4 | AP005 | GVM Corporation, Visakhapatnam - APPCB | Visakhapatnam | Andhra Pradesh | Active |
| 5 | AS001 | Railway Colony, Guwahati - APCB | Guwahati | Assam | Active |
| 10 | BR005 | DRM Office Danapur, Patna - BSPCB | Patna | Bihar | Active |
| 11 | BR006 | Govt. High School Shikarpur, Patna - BSPCB | Patna | Bihar | Active |
| ... | ... | ... | ... | ... | ... |
| 224 | WB009 | Fort William, Kolkata - WBPCB | Kolkata | West Bengal | Active |
| 225 | WB010 | Jadavpur, Kolkata - WBPCB | Kolkata | West Bengal | Active |
| 226 | WB011 | Rabindra Bharati University, Kolkata - WBPCB | Kolkata | West Bengal | Active |
| 227 | WB012 | Rabindra Sarobar, Kolkata - WBPCB | Kolkata | West Bengal | Active |
| 228 | WB013 | Victoria, Kolkata - WBPCB | Kolkata | West Bengal | Active |
133 rows × 5 columns
| | StationId | StationName | City | State |
|---|---|---|---|---|
| 0 | AP001 | Secretariat, Amaravati - APPCB | Amaravati | Andhra Pradesh |
| 1 | AP005 | GVM Corporation, Visakhapatnam - APPCB | Visakhapatnam | Andhra Pradesh |
| 2 | AS001 | Railway Colony, Guwahati - APCB | Guwahati | Assam |
| 3 | BR005 | DRM Office Danapur, Patna - BSPCB | Patna | Bihar |
| 4 | BR006 | Govt. High School Shikarpur, Patna - BSPCB | Patna | Bihar |
| ... | ... | ... | ... | ... |
| 128 | WB009 | Fort William, Kolkata - WBPCB | Kolkata | West Bengal |
| 129 | WB010 | Jadavpur, Kolkata - WBPCB | Kolkata | West Bengal |
| 130 | WB011 | Rabindra Bharati University, Kolkata - WBPCB | Kolkata | West Bengal |
| 131 | WB012 | Rabindra Sarobar, Kolkata - WBPCB | Kolkata | West Bengal |
| 132 | WB013 | Victoria, Kolkata - WBPCB | Kolkata | West Bengal |
133 rows × 4 columns
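The step from 230 stations to 133 rows without a Status column looks like a filter on active stations followed by dropping that column and resetting the index. A minimal sketch of that idea, assuming a frame named `stations` (our name) and using three made-up rows shaped like the table above:

```python
import pandas as pd

# Tiny stand-in for the stations table; only the columns needed here.
stations = pd.DataFrame(
    {
        "StationId": ["AP001", "AP002", "AP005"],
        "City": ["Amaravati", "Rajamahendravaram", "Visakhapatnam"],
        "Status": ["Active", None, "Active"],
    }
)

# Keep active stations, drop the now-constant Status column,
# and renumber the rows from 0.
active_stations = (
    stations[stations["Status"] == "Active"]
    .drop(columns="Status")
    .reset_index(drop=True)
)
```

Rows whose Status is missing compare as False, so they are excluded together with any explicitly inactive stations.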
| | StationId | Date | PM2.5 | PM10 | NO | NO2 | NOx | NH3 | CO | SO2 | O3 | Benzene | Toluene | Xylene | AQI | AQI_Bucket |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | AP001 | 2017-11-24 | 71.36 | 115.75 | 1.75 | 20.65 | 12.40 | 12.19 | 0.10 | 10.76 | 109.26 | 0.17 | 5.92 | 0.10 | NaN | NaN |
| 1 | AP001 | 2017-11-25 | 81.40 | 124.50 | 1.44 | 20.50 | 12.08 | 10.72 | 0.12 | 15.24 | 127.09 | 0.20 | 6.50 | 0.06 | 184.0 | Moderate |
| 2 | AP001 | 2017-11-26 | 78.32 | 129.06 | 1.26 | 26.00 | 14.85 | 10.28 | 0.14 | 26.96 | 117.44 | 0.22 | 7.95 | 0.08 | 197.0 | Moderate |
| 3 | AP001 | 2017-11-27 | 88.76 | 135.32 | 6.60 | 30.85 | 21.77 | 12.91 | 0.11 | 33.59 | 111.81 | 0.29 | 7.63 | 0.12 | 198.0 | Moderate |
| 4 | AP001 | 2017-11-28 | 64.18 | 104.09 | 2.56 | 28.07 | 17.01 | 11.42 | 0.09 | 19.00 | 138.18 | 0.17 | 5.02 | 0.07 | 188.0 | Moderate |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 108030 | WB013 | 2020-06-27 | 8.65 | 16.46 | NaN | NaN | NaN | NaN | 0.69 | 4.36 | 30.59 | 1.32 | 7.26 | NaN | 50.0 | Good |
| 108031 | WB013 | 2020-06-28 | 11.80 | 18.47 | NaN | NaN | NaN | NaN | 0.68 | 3.49 | 38.95 | 1.42 | 7.92 | NaN | 65.0 | Satisfactory |
| 108032 | WB013 | 2020-06-29 | 18.60 | 32.26 | 13.65 | 200.87 | 214.20 | 11.40 | 0.78 | 5.12 | 38.17 | 3.52 | 8.64 | NaN | 63.0 | Satisfactory |
| 108033 | WB013 | 2020-06-30 | 16.07 | 39.30 | 7.56 | 29.13 | 36.69 | 29.26 | 0.69 | 5.88 | 29.64 | 1.86 | 8.40 | NaN | 57.0 | Satisfactory |
| 108034 | WB013 | 2020-07-01 | 10.50 | 36.50 | 7.78 | 22.50 | 30.25 | 27.23 | 0.58 | 2.80 | 13.10 | 1.31 | 7.39 | NaN | 59.0 | Satisfactory |
108035 rows × 16 columns
| | StationId | StationName | City | State | Date | PM2.5 | PM10 | NO | NO2 | NOx | NH3 | CO | SO2 | O3 | Benzene | Toluene | Xylene | AQI | AQI_Bucket |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | AP001 | Secretariat, Amaravati - APPCB | Amaravati | Andhra Pradesh | 2017-11-24 | 71.36 | 115.75 | 1.75 | 20.65 | 12.40 | 12.19 | 0.10 | 10.76 | 109.26 | 0.17 | 5.92 | 0.10 | NaN | NaN |
| 1 | AP001 | Secretariat, Amaravati - APPCB | Amaravati | Andhra Pradesh | 2017-11-25 | 81.40 | 124.50 | 1.44 | 20.50 | 12.08 | 10.72 | 0.12 | 15.24 | 127.09 | 0.20 | 6.50 | 0.06 | 184.0 | Moderate |
| 2 | AP001 | Secretariat, Amaravati - APPCB | Amaravati | Andhra Pradesh | 2017-11-26 | 78.32 | 129.06 | 1.26 | 26.00 | 14.85 | 10.28 | 0.14 | 26.96 | 117.44 | 0.22 | 7.95 | 0.08 | 197.0 | Moderate |
| 3 | AP001 | Secretariat, Amaravati - APPCB | Amaravati | Andhra Pradesh | 2017-11-27 | 88.76 | 135.32 | 6.60 | 30.85 | 21.77 | 12.91 | 0.11 | 33.59 | 111.81 | 0.29 | 7.63 | 0.12 | 198.0 | Moderate |
| 4 | AP001 | Secretariat, Amaravati - APPCB | Amaravati | Andhra Pradesh | 2017-11-28 | 64.18 | 104.09 | 2.56 | 28.07 | 17.01 | 11.42 | 0.09 | 19.00 | 138.18 | 0.17 | 5.02 | 0.07 | 188.0 | Moderate |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 107706 | WB013 | Victoria, Kolkata - WBPCB | Kolkata | West Bengal | 2020-06-27 | 8.65 | 16.46 | NaN | NaN | NaN | NaN | 0.69 | 4.36 | 30.59 | 1.32 | 7.26 | NaN | 50.0 | Good |
| 107707 | WB013 | Victoria, Kolkata - WBPCB | Kolkata | West Bengal | 2020-06-28 | 11.80 | 18.47 | NaN | NaN | NaN | NaN | 0.68 | 3.49 | 38.95 | 1.42 | 7.92 | NaN | 65.0 | Satisfactory |
| 107708 | WB013 | Victoria, Kolkata - WBPCB | Kolkata | West Bengal | 2020-06-29 | 18.60 | 32.26 | 13.65 | 200.87 | 214.20 | 11.40 | 0.78 | 5.12 | 38.17 | 3.52 | 8.64 | NaN | 63.0 | Satisfactory |
| 107709 | WB013 | Victoria, Kolkata - WBPCB | Kolkata | West Bengal | 2020-06-30 | 16.07 | 39.30 | 7.56 | 29.13 | 36.69 | 29.26 | 0.69 | 5.88 | 29.64 | 1.86 | 8.40 | NaN | 57.0 | Satisfactory |
| 107710 | WB013 | Victoria, Kolkata - WBPCB | Kolkata | West Bengal | 2020-07-01 | 10.50 | 36.50 | 7.78 | 22.50 | 30.25 | 27.23 | 0.58 | 2.80 | 13.10 | 1.31 | 7.39 | NaN | 59.0 | Satisfactory |
107711 rows × 19 columns
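The combined table above has the station details followed by the daily readings, and fewer rows than the raw daily data, which is what an inner join on StationId against the active stations would produce. A small sketch under that assumption (the frame names and sample values are ours):

```python
import pandas as pd

# Active stations only (stand-in data).
active_stations = pd.DataFrame(
    {"StationId": ["AP001"], "City": ["Amaravati"]}
)

# Daily readings, including one station that is not in the active list.
station_day = pd.DataFrame(
    {
        "StationId": ["AP001", "AP001", "AP002"],
        "Date": ["2017-11-24", "2017-11-25", "2017-11-24"],
        "AQI": [None, 184.0, 120.0],
    }
)

# Inner join: readings from stations missing on either side are dropped.
merged = active_stations.merge(station_day, on="StationId", how="inner")
```

Putting the stations frame on the left keeps its columns first, matching the column order of the table.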
| | StationId | StationName | City | State | Date | PM2.5 | PM10 | NO | NO2 | NOx | NH3 | CO | SO2 | O3 | Benzene | Toluene | Xylene | AQI | Air_quality |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | AP001 | Secretariat, Amaravati - APPCB | Amaravati | Andhra Pradesh | 2017-11-24 | 71.36 | 115.75 | 1.75 | 20.65 | 12.40 | 12.19 | 0.10 | 10.76 | 109.26 | 0.17 | 5.92 | 0.10 | 184.0 | Moderate |
| 1 | AP001 | Secretariat, Amaravati - APPCB | Amaravati | Andhra Pradesh | 2017-11-25 | 81.40 | 124.50 | 1.44 | 20.50 | 12.08 | 10.72 | 0.12 | 15.24 | 127.09 | 0.20 | 6.50 | 0.06 | 184.0 | Moderate |
| 2 | AP001 | Secretariat, Amaravati - APPCB | Amaravati | Andhra Pradesh | 2017-11-26 | 78.32 | 129.06 | 1.26 | 26.00 | 14.85 | 10.28 | 0.14 | 26.96 | 117.44 | 0.22 | 7.95 | 0.08 | 197.0 | Moderate |
| 3 | AP001 | Secretariat, Amaravati - APPCB | Amaravati | Andhra Pradesh | 2017-11-27 | 88.76 | 135.32 | 6.60 | 30.85 | 21.77 | 12.91 | 0.11 | 33.59 | 111.81 | 0.29 | 7.63 | 0.12 | 198.0 | Moderate |
| 4 | AP001 | Secretariat, Amaravati - APPCB | Amaravati | Andhra Pradesh | 2017-11-28 | 64.18 | 104.09 | 2.56 | 28.07 | 17.01 | 11.42 | 0.09 | 19.00 | 138.18 | 0.17 | 5.02 | 0.07 | 188.0 | Moderate |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 107706 | WB013 | Victoria, Kolkata - WBPCB | Kolkata | West Bengal | 2020-06-27 | 8.65 | 16.46 | NaN | NaN | NaN | NaN | 0.69 | 4.36 | 30.59 | 1.32 | 7.26 | NaN | 50.0 | Good |
| 107707 | WB013 | Victoria, Kolkata - WBPCB | Kolkata | West Bengal | 2020-06-28 | 11.80 | 18.47 | 13.65 | 200.87 | 214.20 | 11.40 | 0.68 | 3.49 | 38.95 | 1.42 | 7.92 | NaN | 65.0 | Satisfactory |
| 107708 | WB013 | Victoria, Kolkata - WBPCB | Kolkata | West Bengal | 2020-06-29 | 18.60 | 32.26 | 13.65 | 200.87 | 214.20 | 11.40 | 0.78 | 5.12 | 38.17 | 3.52 | 8.64 | NaN | 63.0 | Satisfactory |
| 107709 | WB013 | Victoria, Kolkata - WBPCB | Kolkata | West Bengal | 2020-06-30 | 16.07 | 39.30 | 7.56 | 29.13 | 36.69 | 29.26 | 0.69 | 5.88 | 29.64 | 1.86 | 8.40 | NaN | 57.0 | Satisfactory |
| 107710 | WB013 | Victoria, Kolkata - WBPCB | Kolkata | West Bengal | 2020-07-01 | 10.50 | 36.50 | 7.78 | 22.50 | 30.25 | 27.23 | 0.58 | 2.80 | 13.10 | 1.31 | 7.39 | NaN | 59.0 | Satisfactory |
107711 rows × 19 columns
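Compared with the previous table, the bucket column is now called Air_quality, and a few missing cells carry the value of the following row (for example the AQI of the first row). That pattern is consistent with a rename followed by a backward fill limited to one row, although the original cell is not shown, so the sketch below is a guess at the idea rather than the exact code used:

```python
import pandas as pd

df = pd.DataFrame(
    {
        "NO": [None, None, 13.65],
        "AQI": [None, 184.0, 197.0],
        "AQI_Bucket": [None, "Moderate", "Moderate"],
    }
)

# Rename the bucket column, then fill each gap from the next row,
# but only one row back (limit=1), so longer gaps stay partly missing.
df = df.rename(columns={"AQI_Bucket": "Air_quality"})
df = df.bfill(limit=1)
```

Because of `limit=1`, a run of two consecutive missing values keeps its first entry missing, which would explain why many NaN values remain after this step.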
| | City | Date | PM2.5 | PM10 | NO | NO2 | NOx | NH3 | CO | SO2 | O3 | Benzene | Toluene | Xylene | AQI | AQI_Bucket |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Ahmedabad | 2015-01-01 | NaN | NaN | 0.92 | 18.22 | 17.15 | NaN | 0.92 | 27.64 | 133.36 | 0.00 | 0.02 | 0.00 | NaN | NaN |
| 1 | Ahmedabad | 2015-01-02 | NaN | NaN | 0.97 | 15.69 | 16.46 | NaN | 0.97 | 24.55 | 34.06 | 3.68 | 5.50 | 3.77 | NaN | NaN |
| 2 | Ahmedabad | 2015-01-03 | NaN | NaN | 17.40 | 19.30 | 29.70 | NaN | 17.40 | 29.07 | 30.70 | 6.80 | 16.40 | 2.25 | NaN | NaN |
| 3 | Ahmedabad | 2015-01-04 | NaN | NaN | 1.70 | 18.48 | 17.97 | NaN | 1.70 | 18.59 | 36.08 | 4.43 | 10.14 | 1.00 | NaN | NaN |
| 4 | Ahmedabad | 2015-01-05 | NaN | NaN | 22.10 | 21.42 | 37.76 | NaN | 22.10 | 39.33 | 39.31 | 7.01 | 18.89 | 2.78 | NaN | NaN |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 29526 | Visakhapatnam | 2020-06-27 | 15.02 | 50.94 | 7.68 | 25.06 | 19.54 | 12.47 | 0.47 | 8.55 | 23.30 | 2.24 | 12.07 | 0.73 | 41.0 | Good |
| 29527 | Visakhapatnam | 2020-06-28 | 24.38 | 74.09 | 3.42 | 26.06 | 16.53 | 11.99 | 0.52 | 12.72 | 30.14 | 0.74 | 2.21 | 0.38 | 70.0 | Satisfactory |
| 29528 | Visakhapatnam | 2020-06-29 | 22.91 | 65.73 | 3.45 | 29.53 | 18.33 | 10.71 | 0.48 | 8.42 | 30.96 | 0.01 | 0.01 | 0.00 | 68.0 | Satisfactory |
| 29529 | Visakhapatnam | 2020-06-30 | 16.64 | 49.97 | 4.05 | 29.26 | 18.80 | 10.03 | 0.52 | 9.84 | 28.30 | 0.00 | 0.00 | 0.00 | 54.0 | Satisfactory |
| 29530 | Visakhapatnam | 2020-07-01 | 15.00 | 66.00 | 0.40 | 26.85 | 14.05 | 5.20 | 0.59 | 2.10 | 17.05 | NaN | NaN | NaN | 50.0 | Good |
29531 rows × 16 columns
| | City | Date | PM2.5 | PM10 | NO | NO2 | NOx | NH3 | CO | SO2 | O3 | Benzene | Toluene | Xylene | AQI | Air_quality |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Ahmedabad | 2015-01-01 | NaN | NaN | 0.92 | 18.22 | 17.15 | NaN | 0.92 | 27.64 | 133.36 | 0.00 | 0.02 | 0.00 | NaN | NaN |
| 1 | Ahmedabad | 2015-01-02 | NaN | NaN | 0.97 | 15.69 | 16.46 | NaN | 0.97 | 24.55 | 34.06 | 3.68 | 5.50 | 3.77 | NaN | NaN |
| 2 | Ahmedabad | 2015-01-03 | NaN | NaN | 17.40 | 19.30 | 29.70 | NaN | 17.40 | 29.07 | 30.70 | 6.80 | 16.40 | 2.25 | NaN | NaN |
| 3 | Ahmedabad | 2015-01-04 | NaN | NaN | 1.70 | 18.48 | 17.97 | NaN | 1.70 | 18.59 | 36.08 | 4.43 | 10.14 | 1.00 | NaN | NaN |
| 4 | Ahmedabad | 2015-01-05 | NaN | NaN | 22.10 | 21.42 | 37.76 | NaN | 22.10 | 39.33 | 39.31 | 7.01 | 18.89 | 2.78 | NaN | NaN |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 29526 | Visakhapatnam | 2020-06-27 | 15.02 | 50.94 | 7.68 | 25.06 | 19.54 | 12.47 | 0.47 | 8.55 | 23.30 | 2.24 | 12.07 | 0.73 | 41.0 | Good |
| 29527 | Visakhapatnam | 2020-06-28 | 24.38 | 74.09 | 3.42 | 26.06 | 16.53 | 11.99 | 0.52 | 12.72 | 30.14 | 0.74 | 2.21 | 0.38 | 70.0 | Satisfactory |
| 29528 | Visakhapatnam | 2020-06-29 | 22.91 | 65.73 | 3.45 | 29.53 | 18.33 | 10.71 | 0.48 | 8.42 | 30.96 | 0.01 | 0.01 | 0.00 | 68.0 | Satisfactory |
| 29529 | Visakhapatnam | 2020-06-30 | 16.64 | 49.97 | 4.05 | 29.26 | 18.80 | 10.03 | 0.52 | 9.84 | 28.30 | 0.00 | 0.00 | 0.00 | 54.0 | Satisfactory |
| 29530 | Visakhapatnam | 2020-07-01 | 15.00 | 66.00 | 0.40 | 26.85 | 14.05 | 5.20 | 0.59 | 2.10 | 17.05 | NaN | NaN | NaN | 50.0 | Good |
29531 rows × 16 columns
(107711, 19)
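| Column | Missing values |
|---|---|
| StationId | 0 |
| StationName | 0 |
| City | 0 |
| State | 0 |
| Date | 0 |
| PM2.5 | 20417 |
| PM10 | 41789 |
| NO | 15629 |
| NO2 | 15058 |
| NOx | 14346 |
| NH3 | 47245 |
| CO | 11386 |
| SO2 | 23922 |
| O3 | 24213 |
| Benzene | 30164 |
| Toluene | 37453 |
| Xylene | 84595 |
| AQI | 18958 |
| Air_quality | 18958 |

dtype: int64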
We use the missingno library to visualise the missing values, so that we can decide how to replace them.
Your selected dataframe has 19 columns. There are 14 columns that have missing values.
| | Missing Values | % of Total Values |
|---|---|---|
| Xylene | 84595 | 78.500000 |
| NH3 | 47245 | 43.900000 |
| PM10 | 41789 | 38.800000 |
| Toluene | 37453 | 34.800000 |
| Benzene | 30164 | 28.000000 |
| O3 | 24213 | 22.500000 |
| SO2 | 23922 | 22.200000 |
| PM2.5 | 20417 | 19.000000 |
| AQI | 18958 | 17.600000 |
| Air_quality | 18958 | 17.600000 |
| NO | 15629 | 14.500000 |
| NO2 | 15058 | 14.000000 |
| NOx | 14346 | 13.300000 |
| CO | 11386 | 10.600000 |
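A summary like the one above can be built from `isnull().sum()` and the row count. The helper below is a sketch; its name and the one-decimal rounding are assumptions chosen to match the figures shown (for example 84595 / 107711 ≈ 78.5%).

```python
import pandas as pd


def missing_values_table(df):
    """Count and percentage of missing values per column, worst first."""
    missing = df.isnull().sum()
    percent = (100 * missing / len(df)).round(1)
    table = pd.DataFrame(
        {"Missing Values": missing, "% of Total Values": percent}
    )
    # Keep only columns that actually have gaps.
    table = table[table["Missing Values"] > 0]
    return table.sort_values("Missing Values", ascending=False)


# Small made-up frame to show the shape of the result.
demo = pd.DataFrame(
    {
        "a": [1.0, None, None, None],
        "b": [1.0, 2.0, None, 4.0],
        "c": [1, 2, 3, 4],
    }
)
table = missing_values_table(demo)
```

Columns with no missing values (here `c`) are left out, which is why the table lists 14 of the 19 columns.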
<class 'pandas.core.frame.DataFrame'>

Int64Index: 107711 entries, 0 to 107710

Data columns (total 19 columns):

| # | Column | Non-Null Count | Dtype |
|---|---|---|---|
| 0 | StationId | 107711 non-null | object |
| 1 | StationName | 107711 non-null | object |
| 2 | City | 107711 non-null | object |
| 3 | State | 107711 non-null | object |
| 4 | Date | 107711 non-null | datetime64[ns] |
| 5 | PM2.5 | 87294 non-null | float64 |
| 6 | PM10 | 65922 non-null | float64 |
| 7 | NO | 92082 non-null | float64 |
| 8 | NO2 | 92653 non-null | float64 |
| 9 | NOx | 93365 non-null | float64 |
| 10 | NH3 | 60466 non-null | float64 |
| 11 | CO | 96325 non-null | float64 |
| 12 | SO2 | 83789 non-null | float64 |
| 13 | O3 | 83498 non-null | float64 |
| 14 | Benzene | 77547 non-null | float64 |
| 15 | Toluene | 70258 non-null | float64 |
| 16 | Xylene | 23116 non-null | float64 |
| 17 | AQI | 88753 non-null | float64 |
| 18 | Air_quality | 88753 non-null | object |

dtypes: datetime64[ns](1), float64(13), object(5)

memory usage: 16.4+ MB
| Column name | Unique values |
|---|---|
| StationId | 108 |
| StationName | 108 |
| City | 24 |
| State | 21 |
| Date | 2009 |
| PM2.5 | 22392 |
| PM10 | 29547 |
| NO | 11914 |
| NO2 | 12051 |
| NOx | 15585 |
| NH3 | 9112 |
| CO | 2353 |
| SO2 | 5802 |
| O3 | 11161 |
| Benzene | 3018 |
| Toluene | 8714 |
| Xylene | 1893 |
| AQI | 931 |
| Air_quality | 7 |
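Air_quality shows 7 unique values even though there are only six AQI categories. One likely explanation is that the count was taken with `unique()`, which treats the missing value as a value of its own, whereas `nunique()` skips it by default. A short sketch of the difference on made-up data:

```python
import pandas as pd

air_quality = pd.Series(["Good", "Poor", None, "Good"])

# unique() returns the missing value as its own entry ...
count_with_missing = len(air_quality.unique())

# ... while nunique() ignores it unless dropna=False is passed.
count_without_missing = air_quality.nunique()
count_explicit = air_quality.nunique(dropna=False)
```

Here the first and third counts are 3 and the second is 2, so the unique counts listed for columns with missing data may each be one higher than the number of real values.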
Visualising the yearly data of every pollutant
We create a BTX column that holds the sum of Benzene, Toluene and Xylene, because these compounds have a similar biological nature.
We also create a Particulate_Matter column that holds the sum of PM2.5 and PM10.
| | StationId | StationName | City | State | Date | PM2.5 | PM10 | NO | NO2 | NOx | NH3 | CO | SO2 | O3 | AQI | Air_quality | BTX | Particulate_Matter |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | AP001 | Secretariat, Amaravati - APPCB | Amaravati | Andhra Pradesh | 2017-11-24 | 71.36 | 115.75 | 1.75 | 20.65 | 12.40 | 12.19 | 0.10 | 10.76 | 109.26 | 184.0 | Moderate | 6.19 | 187.11 |
| 1 | AP001 | Secretariat, Amaravati - APPCB | Amaravati | Andhra Pradesh | 2017-11-25 | 81.40 | 124.50 | 1.44 | 20.50 | 12.08 | 10.72 | 0.12 | 15.24 | 127.09 | 184.0 | Moderate | 6.76 | 205.90 |
| 2 | AP001 | Secretariat, Amaravati - APPCB | Amaravati | Andhra Pradesh | 2017-11-26 | 78.32 | 129.06 | 1.26 | 26.00 | 14.85 | 10.28 | 0.14 | 26.96 | 117.44 | 197.0 | Moderate | 8.25 | 207.38 |
| 3 | AP001 | Secretariat, Amaravati - APPCB | Amaravati | Andhra Pradesh | 2017-11-27 | 88.76 | 135.32 | 6.60 | 30.85 | 21.77 | 12.91 | 0.11 | 33.59 | 111.81 | 198.0 | Moderate | 8.04 | 224.08 |
| 4 | AP001 | Secretariat, Amaravati - APPCB | Amaravati | Andhra Pradesh | 2017-11-28 | 64.18 | 104.09 | 2.56 | 28.07 | 17.01 | 11.42 | 0.09 | 19.00 | 138.18 | 188.0 | Moderate | 5.26 | 168.27 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 107706 | WB013 | Victoria, Kolkata - WBPCB | Kolkata | West Bengal | 2020-06-27 | 8.65 | 16.46 | NaN | NaN | NaN | NaN | 0.69 | 4.36 | 30.59 | 50.0 | Good | NaN | 25.11 |
| 107707 | WB013 | Victoria, Kolkata - WBPCB | Kolkata | West Bengal | 2020-06-28 | 11.80 | 18.47 | 13.65 | 200.87 | 214.20 | 11.40 | 0.68 | 3.49 | 38.95 | 65.0 | Satisfactory | NaN | 30.27 |
| 107708 | WB013 | Victoria, Kolkata - WBPCB | Kolkata | West Bengal | 2020-06-29 | 18.60 | 32.26 | 13.65 | 200.87 | 214.20 | 11.40 | 0.78 | 5.12 | 38.17 | 63.0 | Satisfactory | NaN | 50.86 |
| 107709 | WB013 | Victoria, Kolkata - WBPCB | Kolkata | West Bengal | 2020-06-30 | 16.07 | 39.30 | 7.56 | 29.13 | 36.69 | 29.26 | 0.69 | 5.88 | 29.64 | 57.0 | Satisfactory | NaN | 55.37 |
| 107710 | WB013 | Victoria, Kolkata - WBPCB | Kolkata | West Bengal | 2020-07-01 | 10.50 | 36.50 | 7.78 | 22.50 | 30.25 | 27.23 | 0.58 | 2.80 | 13.10 | 59.0 | Satisfactory | NaN | 47.00 |
107711 rows × 18 columns
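The two derived columns in the table above match a plain element-wise sum: 0.17 + 5.92 + 0.10 = 6.19 for BTX and 71.36 + 115.75 = 187.11 for Particulate_Matter in the first row. A sketch that reproduces those two rows (the frame name `df` is our choice):

```python
import pandas as pd

# First and one of the last rows of the table, pollutant columns only.
df = pd.DataFrame(
    {
        "PM2.5": [71.36, 8.65],
        "PM10": [115.75, 16.46],
        "Benzene": [0.17, 1.32],
        "Toluene": [5.92, 7.26],
        "Xylene": [0.10, None],
    }
)

# Using + means a missing component makes the whole sum missing.
df["BTX"] = df["Benzene"] + df["Toluene"] + df["Xylene"]
df["Particulate_Matter"] = df["PM2.5"] + df["PM10"]

# The individual BTX components are no longer needed.
df = df.drop(columns=["Benzene", "Toluene", "Xylene"])
```

Because Xylene is missing in most rows, BTX is also missing in most rows; summing with `df[cols].sum(axis=1, min_count=1)` would be an alternative that keeps partial sums, at the cost of mixing totals built from different numbers of components.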
(29531, 16)
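| Column | Missing values |
|---|---|
| City | 0 |
| Date | 0 |
| PM2.5 | 4321 |
| PM10 | 10866 |
| NO | 3276 |
| NO2 | 3278 |
| NOx | 3980 |
| NH3 | 10061 |
| CO | 1745 |
| SO2 | 3510 |
| O3 | 3664 |
| Benzene | 5298 |
| Toluene | 7739 |
| Xylene | 17878 |
| AQI | 4174 |
| Air_quality | 4174 |

dtype: int64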
We again use the missingno library to visualise the missing values, this time for the city-level data, so that we can decide how to replace them.
Your selected dataframe has 16 columns. There are 14 columns that have missing values.
| | Missing Values | % of Total Values |
|---|---|---|
| Xylene | 17878 | 60.500000 |
| PM10 | 10866 | 36.800000 |
| NH3 | 10061 | 34.100000 |
| Toluene | 7739 | 26.200000 |
| Benzene | 5298 | 17.900000 |
| PM2.5 | 4321 | 14.600000 |
| AQI | 4174 | 14.100000 |
| Air_quality | 4174 | 14.100000 |
| NOx | 3980 | 13.500000 |
| O3 | 3664 | 12.400000 |
| SO2 | 3510 | 11.900000 |
| NO2 | 3278 | 11.100000 |
| NO | 3276 | 11.100000 |
| CO | 1745 | 5.900000 |
<class 'pandas.core.frame.DataFrame'>

RangeIndex: 29531 entries, 0 to 29530

Data columns (total 16 columns):

| # | Column | Non-Null Count | Dtype |
|---|---|---|---|
| 0 | City | 29531 non-null | object |
| 1 | Date | 29531 non-null | datetime64[ns] |
| 2 | PM2.5 | 25210 non-null | float64 |
| 3 | PM10 | 18665 non-null | float64 |
| 4 | NO | 26255 non-null | float64 |
| 5 | NO2 | 26253 non-null | float64 |
| 6 | NOx | 25551 non-null | float64 |
| 7 | NH3 | 19470 non-null | float64 |
| 8 | CO | 27786 non-null | float64 |
| 9 | SO2 | 26021 non-null | float64 |
| 10 | O3 | 25867 non-null | float64 |
| 11 | Benzene | 24233 non-null | float64 |
| 12 | Toluene | 21792 non-null | float64 |
| 13 | Xylene | 11653 non-null | float64 |
| 14 | AQI | 25357 non-null | float64 |
| 15 | Air_quality | 25357 non-null | object |

dtypes: datetime64[ns](1), float64(13), object(2)

memory usage: 3.6+ MB
| Column name | Unique values |
|---|---|
| City | 26 |
| Date | 2009 |
| PM2.5 | 11717 |
| PM10 | 12572 |
| NO | 5777 |
| NO2 | 7405 |
| NOx | 8157 |
| NH3 | 5923 |
| CO | 1780 |
| SO2 | 4762 |
| O3 | 7700 |
| Benzene | 1874 |
| Toluene | 3609 |
| Xylene | 1562 |
| AQI | 830 |
| Air_quality | 7 |
Visualising the yearly data of every pollutant
As with the station data, we create a BTX column that holds the sum of Benzene, Toluene and Xylene, because these compounds have a similar biological nature.
We also create a Particulate_Matter column that holds the sum of PM2.5 and PM10.
| | City | Date | PM2.5 | PM10 | NO | NO2 | NOx | NH3 | CO | SO2 | O3 | AQI | Air_quality | BTX | Particulate_Matter |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Ahmedabad | 2015-01-01 | NaN | NaN | 0.92 | 18.22 | 17.15 | NaN | 0.92 | 27.64 | 133.36 | NaN | NaN | 0.02 | NaN |
| 1 | Ahmedabad | 2015-01-02 | NaN | NaN | 0.97 | 15.69 | 16.46 | NaN | 0.97 | 24.55 | 34.06 | NaN | NaN | 12.95 | NaN |
| 2 | Ahmedabad | 2015-01-03 | NaN | NaN | 17.40 | 19.30 | 29.70 | NaN | 17.40 | 29.07 | 30.70 | NaN | NaN | 25.45 | NaN |
| 3 | Ahmedabad | 2015-01-04 | NaN | NaN | 1.70 | 18.48 | 17.97 | NaN | 1.70 | 18.59 | 36.08 | NaN | NaN | 15.57 | NaN |
| 4 | Ahmedabad | 2015-01-05 | NaN | NaN | 22.10 | 21.42 | 37.76 | NaN | 22.10 | 39.33 | 39.31 | NaN | NaN | 28.68 | NaN |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 29526 | Visakhapatnam | 2020-06-27 | 15.02 | 50.94 | 7.68 | 25.06 | 19.54 | 12.47 | 0.47 | 8.55 | 23.30 | 41.0 | Good | 15.04 | 65.96 |
| 29527 | Visakhapatnam | 2020-06-28 | 24.38 | 74.09 | 3.42 | 26.06 | 16.53 | 11.99 | 0.52 | 12.72 | 30.14 | 70.0 | Satisfactory | 3.33 | 98.47 |
| 29528 | Visakhapatnam | 2020-06-29 | 22.91 | 65.73 | 3.45 | 29.53 | 18.33 | 10.71 | 0.48 | 8.42 | 30.96 | 68.0 | Satisfactory | 0.02 | 88.64 |
| 29529 | Visakhapatnam | 2020-06-30 | 16.64 | 49.97 | 4.05 | 29.26 | 18.80 | 10.03 | 0.52 | 9.84 | 28.30 | 54.0 | Satisfactory | 0.00 | 66.61 |
| 29530 | Visakhapatnam | 2020-07-01 | 15.00 | 66.00 | 0.40 | 26.85 | 14.05 | 5.20 | 0.59 | 2.10 | 17.05 | 50.0 | Good | NaN | 81.00 |
29531 rows × 15 columns
| Date | StationId | StationName | City | State | PM2.5 | PM10 | NO | NO2 | NOx | NH3 | CO | SO2 | O3 | AQI | Air_quality | BTX | Particulate_Matter |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2017-11-24 | AP001 | Secretariat, Amaravati - APPCB | Amaravati | Andhra Pradesh | 71.36 | 115.75 | 1.75 | 20.65 | 12.40 | 12.19 | 0.10 | 10.76 | 109.26 | 184.0 | Moderate | 6.19 | 187.11 |
| 2017-11-25 | AP001 | Secretariat, Amaravati - APPCB | Amaravati | Andhra Pradesh | 81.40 | 124.50 | 1.44 | 20.50 | 12.08 | 10.72 | 0.12 | 15.24 | 127.09 | 184.0 | Moderate | 6.76 | 205.90 |
| 2017-11-26 | AP001 | Secretariat, Amaravati - APPCB | Amaravati | Andhra Pradesh | 78.32 | 129.06 | 1.26 | 26.00 | 14.85 | 10.28 | 0.14 | 26.96 | 117.44 | 197.0 | Moderate | 8.25 | 207.38 |
| 2017-11-27 | AP001 | Secretariat, Amaravati - APPCB | Amaravati | Andhra Pradesh | 88.76 | 135.32 | 6.60 | 30.85 | 21.77 | 12.91 | 0.11 | 33.59 | 111.81 | 198.0 | Moderate | 8.04 | 224.08 |
| 2017-11-28 | AP001 | Secretariat, Amaravati - APPCB | Amaravati | Andhra Pradesh | 64.18 | 104.09 | 2.56 | 28.07 | 17.01 | 11.42 | 0.09 | 19.00 | 138.18 | 188.0 | Moderate | 5.26 | 168.27 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 2020-06-27 | WB013 | Victoria, Kolkata - WBPCB | Kolkata | West Bengal | 8.65 | 16.46 | NaN | NaN | NaN | NaN | 0.69 | 4.36 | 30.59 | 50.0 | Good | NaN | 25.11 |
| 2020-06-28 | WB013 | Victoria, Kolkata - WBPCB | Kolkata | West Bengal | 11.80 | 18.47 | 13.65 | 200.87 | 214.20 | 11.40 | 0.68 | 3.49 | 38.95 | 65.0 | Satisfactory | NaN | 30.27 |
| 2020-06-29 | WB013 | Victoria, Kolkata - WBPCB | Kolkata | West Bengal | 18.60 | 32.26 | 13.65 | 200.87 | 214.20 | 11.40 | 0.78 | 5.12 | 38.17 | 63.0 | Satisfactory | NaN | 50.86 |
| 2020-06-30 | WB013 | Victoria, Kolkata - WBPCB | Kolkata | West Bengal | 16.07 | 39.30 | 7.56 | 29.13 | 36.69 | 29.26 | 0.69 | 5.88 | 29.64 | 57.0 | Satisfactory | NaN | 55.37 |
| 2020-07-01 | WB013 | Victoria, Kolkata - WBPCB | Kolkata | West Bengal | 10.50 | 36.50 | 7.78 | 22.50 | 30.25 | 27.23 | 0.58 | 2.80 | 13.10 | 59.0 | Satisfactory | NaN | 47.00 |
107711 rows × 17 columns
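With Date as the index, as in the table above, yearly figures per pollutant can be obtained by grouping on the year of the index. The sketch below assumes yearly means are what gets plotted; the sample values are made up.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Date": ["2019-01-01", "2019-06-01", "2020-01-01"],
        "PM2.5": [80.0, 60.0, 40.0],
        "PM10": [150.0, 130.0, 90.0],
    }
)

# Parse the dates and move them into the index.
df["Date"] = pd.to_datetime(df["Date"])
df = df.set_index("Date")

# One row per year, one column per pollutant.
yearly = df.groupby(df.index.year).mean()
```

The resulting frame has one row per calendar year, so calling `yearly.plot()` (with matplotlib installed) would give one line per pollutant.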
| | StationName | PM2.5 |
|---|---|---|
| 0 | Anand Vihar, Delhi - DPCC | 152.350000 |
| 1 | Talkatora District Industries Center, Lucknow - CPCB | 134.690000 |
| 2 | DTU, Delhi - CPCB | 131.080000 |
| 3 | IGSC Planetarium Complex, Patna - BSPCB | 130.450000 |
| 4 | Jahangirpuri, Delhi - DPCC | 128.120000 |
| 5 | Wazirpur, Delhi - DPCC | 127.510000 |
| 6 | Mundka, Delhi - DPCC | 122.460000 |
| 7 | Rohini, Delhi - DPCC | 122.420000 |
| 8 | Bawana, Delhi - DPCC | 120.950000 |
| 9 | Burari Crossing, Delhi - IMD | 120.820000 |
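Rankings such as the one above can be produced by averaging each pollutant per station and sorting. The helper below is a sketch: the function name is ours, and we assume the ranking statistic is the mean rounded to two decimals, which the original cells may or may not have used.

```python
import pandas as pd


def rank_stations(df, pollutant, n=10, most_polluted=True):
    """Top-n stations by mean concentration of one pollutant."""
    means = df.groupby("StationName")[pollutant].mean().round(2)
    # Descending order for the most polluted, ascending for the least.
    ranked = means.sort_values(ascending=not most_polluted).head(n)
    return ranked.reset_index()


# Made-up readings for three stations; station C has no PM2.5 data.
readings = pd.DataFrame(
    {
        "StationName": ["A", "A", "B", "C"],
        "PM2.5": [100.0, 200.0, 50.0, None],
    }
)
top = rank_stations(readings, "PM2.5", n=2)
bottom = rank_stations(readings, "PM2.5", n=2, most_polluted=False)
```

Stations with no readings for the pollutant end up with a missing mean, which `sort_values` places last in either direction, so they do not appear at the top of a ranking. A station with only a handful of readings can still rank very high or very low, which is worth keeping in mind when reading the extremes in these tables.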
| | StationName | PM10 |
|---|---|---|
| 0 | Anand Vihar, Delhi - DPCC | 358.120000 |
| 1 | Wazirpur, Delhi - DPCC | 277.420000 |
| 2 | Dwarka-Sector 8, Delhi - DPCC | 276.470000 |
| 3 | Mundka, Delhi - DPCC | 269.130000 |
| 4 | Jahangirpuri, Delhi - DPCC | 259.450000 |
| 5 | Sirifort, Delhi - CPCB | 252.480000 |
| 6 | Rohini, Delhi - DPCC | 247.870000 |
| 7 | NSIT Dwarka, Delhi - CPCB | 242.770000 |
| 8 | R K Puram, Delhi - DPCC | 242.410000 |
| 9 | DTU, Delhi - CPCB | 236.060000 |
| | StationName | Particulate_Matter |
|---|---|---|
| 0 | Anand Vihar, Delhi - DPCC | 509.740000 |
| 1 | Wazirpur, Delhi - DPCC | 405.510000 |
| 2 | Mundka, Delhi - DPCC | 391.590000 |
| 3 | Jahangirpuri, Delhi - DPCC | 387.750000 |
| 4 | Dwarka-Sector 8, Delhi - DPCC | 379.790000 |
| 5 | Rohini, Delhi - DPCC | 370.440000 |
| 6 | R K Puram, Delhi - DPCC | 361.170000 |
| 7 | Bawana, Delhi - DPCC | 354.700000 |
| 8 | Sirifort, Delhi - CPCB | 349.750000 |
| 9 | DTU, Delhi - CPCB | 346.270000 |
| | StationName | NO |
|---|---|---|
| 0 | Samanpura, Patna - BSPCB | 124.000000 |
| 1 | Anand Vihar, Delhi - DPCC | 90.860000 |
| 2 | Pusa, Delhi - DPCC | 73.060000 |
| 3 | DRM Office Danapur, Patna - BSPCB | 64.930000 |
| 4 | Major Dhyan Chand National Stadium, Delhi - DPCC | 57.780000 |
| 5 | R K Puram, Delhi - DPCC | 54.420000 |
| 6 | Chhatrapati Shivaji Intl. Airport (T2), Mumbai - MPCB | 53.380000 |
| 7 | Jawaharlal Nehru Stadium, Delhi - DPCC | 52.570000 |
| 8 | ITO, Delhi - CPCB | 50.600000 |
| 9 | Sirifort, Delhi - CPCB | 46.990000 |
| | StationName | NO2 |
|---|---|---|
| 0 | Anand Vihar, Delhi - DPCC | 88.720000 |
| 1 | Punjabi Bagh, Delhi - DPCC | 73.280000 |
| 2 | Rajbansi Nagar, Patna - BSPCB | 65.900000 |
| 3 | Jahangirpuri, Delhi - DPCC | 65.860000 |
| 4 | Jawaharlal Nehru Stadium, Delhi - DPCC | 63.340000 |
| 5 | R K Puram, Delhi - DPCC | 63.010000 |
| 6 | Pusa, Delhi - DPCC | 59.570000 |
| 7 | Sirifort, Delhi - CPCB | 58.860000 |
| 8 | Maninagar, Ahmedabad - GPCB | 58.850000 |
| 9 | Major Dhyan Chand National Stadium, Delhi - DPCC | 58.720000 |
| | StationName | NOx |
|---|---|---|
| 0 | Anand Vihar, Delhi - DPCC | 148.780000 |
| 1 | Samanpura, Patna - BSPCB | 141.460000 |
| 2 | East Arjun Nagar, Delhi - CPCB | 121.800000 |
| 3 | Pusa, Delhi - DPCC | 91.230000 |
| 4 | R K Puram, Delhi - DPCC | 86.360000 |
| 5 | Chhatrapati Shivaji Intl. Airport (T2), Mumbai - MPCB | 84.670000 |
| 6 | Major Dhyan Chand National Stadium, Delhi - DPCC | 82.100000 |
| 7 | Jawaharlal Nehru Stadium, Delhi - DPCC | 81.310000 |
| 8 | Sion, Mumbai - MPCB | 74.310000 |
| 9 | Victoria, Kolkata - WBPCB | 73.730000 |
| | StationName | NH3 |
|---|---|---|
| 0 | Manali, Chennai - CPCB | 65.360000 |
| 1 | Anand Vihar, Delhi - DPCC | 55.780000 |
| 2 | Jahangirpuri, Delhi - DPCC | 55.670000 |
| 3 | Rohini, Delhi - DPCC | 53.010000 |
| 4 | ITO, Delhi - CPCB | 52.060000 |
| 5 | IGSC Planetarium Complex, Patna - BSPCB | 51.240000 |
| 6 | NSIT Dwarka, Delhi - CPCB | 48.870000 |
| 7 | Patparganj, Delhi - DPCC | 48.850000 |
| 8 | Shadipur, Delhi - CPCB | 45.680000 |
| 9 | Mundka, Delhi - DPCC | 45.510000 |
| | StationName | CO |
|---|---|---|
| 0 | Maninagar, Ahmedabad - GPCB | 22.360000 |
| 1 | BWSSB Kadabesanahalli, Bengaluru - CPCB | 3.580000 |
| 2 | Shadipur, Delhi - CPCB | 3.480000 |
| 3 | Peenya, Bengaluru - CPCB | 3.010000 |
| 4 | NSIT Dwarka, Delhi - CPCB | 2.840000 |
| 5 | Central School, Lucknow - CPCB | 2.310000 |
| 6 | Lalbagh, Lucknow - CPCB | 2.260000 |
| 7 | Anand Vihar, Delhi - DPCC | 2.200000 |
| 8 | ITO, Delhi - CPCB | 2.120000 |
| 9 | Alandur Bus Depot, Chennai - CPCB | 1.950000 |
| | StationName | SO2 |
|---|---|---|
| 0 | Maninagar, Ahmedabad - GPCB | 55.250000 |
| 1 | Tata Stadium, Jorapokhar - JSPCB | 34.640000 |
| 2 | Talcher Coalfields,Talcher - OSPCB | 28.410000 |
| 3 | Pusa, Delhi - IMD | 27.630000 |
| 4 | Lodhi Road, Delhi - IMD | 23.250000 |
| 5 | North Campus, DU, Delhi - IMD | 23.250000 |
| 6 | IGSC Planetarium Complex, Patna - BSPCB | 22.980000 |
| 7 | R K Puram, Delhi - DPCC | 20.690000 |
| 8 | Punjabi Bagh, Delhi - DPCC | 20.080000 |
| 9 | Alipur, Delhi - DPCC | 19.860000 |
| | StationName | O3 |
|---|---|---|
| 0 | Punjabi Bagh, Delhi - DPCC | 230.100000 |
| 1 | Sector-51, Gurugram - HSPCB | 80.750000 |
| 2 | Manali Village, Chennai - TNPCB | 69.610000 |
| 3 | T T Nagar, Bhopal - MPPCB | 59.940000 |
| 4 | R K Puram, Delhi - DPCC | 54.780000 |
| 5 | Shastri Nagar, Jaipur - RSPCB | 53.200000 |
| 6 | Teri Gram, Gurugram - HSPCB | 52.960000 |
| 7 | Adarsh Nagar, Jaipur - RSPCB | 51.530000 |
| 8 | Sirifort, Delhi - CPCB | 49.010000 |
| 9 | Hombegowda Nagar, Bengaluru - KSPCB | 47.610000 |
| | StationName | BTX |
|---|---|---|
| 0 | Jadavpur, Kolkata - WBPCB | 220.430000 |
| 1 | Talkatora District Industries Center, Lucknow - CPCB | 56.030000 |
| 2 | Maninagar, Ahmedabad - GPCB | 37.630000 |
| 3 | Burari Crossing, Delhi - IMD | 33.190000 |
| 4 | IDA Pashamylaram, Hyderabad - TSPCB | 31.310000 |
| 5 | Fort William, Kolkata - WBPCB | 29.760000 |
| 6 | Bidhannagar, Kolkata - WBPCB | 27.680000 |
| 7 | Ballygunge, Kolkata - WBPCB | 26.130000 |
| 8 | Mandir Marg, Delhi - DPCC | 20.830000 |
| 9 | Teri Gram, Gurugram - HSPCB | 20.500000 |
| | StationName | PM2.5 |
|---|---|---|
| 0 | City Railway Station, Bengaluru - KSPCB | 9.000000 |
| 1 | East Arjun Nagar, Delhi - CPCB | 11.110000 |
| 2 | Sikulpuikawn, Aizawl - Mizoram PCB | 16.850000 |
| 3 | Manali Village, Chennai - TNPCB | 24.480000 |
| 4 | Plammoodu, Thiruvananthapuram - Kerala PCB | 27.220000 |
| 5 | Hombegowda Nagar, Bengaluru - KSPCB | 27.520000 |
| 6 | Hebbal, Bengaluru - KSPCB | 28.930000 |
| 7 | Kariavattom, Thiruvananthapuram - Kerala PCB | 28.980000 |
| 8 | SIDCO Kurichi, Coimbatore - TNPCB | 29.090000 |
| 9 | Borivali East, Mumbai - MPCB | 29.290000 |
| | StationName | PM10 |
|---|---|---|
| 0 | East Arjun Nagar, Delhi - CPCB | 6.320000 |
| 1 | Sikulpuikawn, Aizawl - Mizoram PCB | 23.340000 |
| 2 | Talkatora District Industries Center, Lucknow - CPCB | 26.860000 |
| 3 | SIDCO Kurichi, Coimbatore - TNPCB | 37.740000 |
| 4 | BWSSB Kadabesanahalli, Bengaluru - CPCB | 40.750000 |
| 5 | Lumpyngngad, Shillong - Meghalaya PCB | 41.640000 |
| 6 | Velachery Res. Area, Chennai - CPCB | 43.490000 |
| 7 | Alandur Bus Depot, Chennai - CPCB | 48.550000 |
| 8 | Kariavattom, Thiruvananthapuram - Kerala PCB | 51.490000 |
| 9 | Plammoodu, Thiruvananthapuram - Kerala PCB | 52.190000 |

| | StationName | Particulate_Matter |
|---|---|---|
| 0 | East Arjun Nagar, Delhi - CPCB | 17.430000 |
| 1 | City Railway Station, Bengaluru - KSPCB | 40.000000 |
| 2 | Sikulpuikawn, Aizawl - Mizoram PCB | 40.190000 |
| 3 | Sanegurava Halli, Bengaluru - KSPCB | 45.960000 |
| 4 | Alandur Bus Depot, Chennai - CPCB | 56.320000 |
| 5 | Talkatora District Industries Center, Lucknow - CPCB | 57.880000 |
| 6 | Velachery Res. Area, Chennai - CPCB | 60.770000 |
| 7 | BWSSB Kadabesanahalli, Bengaluru - CPCB | 62.450000 |
| 8 | SIDCO Kurichi, Coimbatore - TNPCB | 66.900000 |
| 9 | Lumpyngngad, Shillong - Meghalaya PCB | 72.880000 |

| | StationName | NO |
|---|---|---|
| 0 | Lumpyngngad, Shillong - Meghalaya PCB | 0.920000 |
| 1 | Bollaram Industrial Area, Hyderabad - TSPCB | 2.960000 |
| 2 | Borivali East, Mumbai - MPCB | 3.260000 |
| 3 | Plammoodu, Thiruvananthapuram - Kerala PCB | 3.410000 |
| 4 | ICRISAT Patancheru, Hyderabad - TSPCB | 3.550000 |
| 5 | Hombegowda Nagar, Bengaluru - KSPCB | 3.570000 |
| 6 | Sector-51, Gurugram - HSPCB | 3.810000 |
| 7 | Secretariat, Amaravati - APPCB | 4.450000 |
| 8 | Peenya, Bengaluru - CPCB | 4.510000 |
| 9 | Powai, Mumbai - MPCB | 4.770000 |

| | StationName | NO2 |
|---|---|---|
| 0 | Sikulpuikawn, Aizawl - Mizoram PCB | 0.390000 |
| 1 | Lumpyngngad, Shillong - Meghalaya PCB | 2.770000 |
| 2 | Borivali East, Mumbai - MPCB | 4.650000 |
| 3 | Teri Gram, Gurugram - HSPCB | 4.910000 |
| 4 | Manali Village, Chennai - TNPCB | 9.110000 |
| 5 | Plammoodu, Thiruvananthapuram - Kerala PCB | 9.190000 |
| 6 | Tata Stadium, Jorapokhar - JSPCB | 9.370000 |
| 7 | Govt. High School Shikarpur, Patna - BSPCB | 9.620000 |
| 8 | Powai, Mumbai - MPCB | 10.390000 |
| 9 | Sector-25, Chandigarh - CPCC | 11.630000 |

| | StationName | NOx |
|---|---|---|
| 0 | Lumpyngngad, Shillong - Meghalaya PCB | 1.000000 |
| 1 | Teri Gram, Gurugram - HSPCB | 5.960000 |
| 2 | Tata Stadium, Jorapokhar - JSPCB | 7.410000 |
| 3 | Plammoodu, Thiruvananthapuram - Kerala PCB | 7.550000 |
| 4 | Borivali East, Mumbai - MPCB | 7.710000 |
| 5 | Govt. High School Shikarpur, Patna - BSPCB | 9.440000 |
| 6 | Sector-51, Gurugram - HSPCB | 11.210000 |
| 7 | ICRISAT Patancheru, Hyderabad - TSPCB | 11.220000 |
| 8 | Bollaram Industrial Area, Hyderabad - TSPCB | 12.140000 |
| 9 | Sikulpuikawn, Aizawl - Mizoram PCB | 12.610000 |

| | StationName | NH3 |
|---|---|---|
| 0 | Lumpyngngad, Shillong - Meghalaya PCB | 2.810000 |
| 1 | Plammoodu, Thiruvananthapuram - Kerala PCB | 5.030000 |
| 2 | Worli, Mumbai - MPCB | 6.560000 |
| 3 | Tata Stadium, Jorapokhar - JSPCB | 7.000000 |
| 4 | Bandra, Mumbai - MPCB | 7.160000 |
| 5 | Colaba, Mumbai - MPCB | 8.020000 |
| 6 | Kariavattom, Thiruvananthapuram - Kerala PCB | 8.040000 |
| 7 | Borivali East, Mumbai - MPCB | 8.170000 |
| 8 | Nishant Ganj, Lucknow - UPPCB | 8.910000 |
| 9 | SIDCO Kurichi, Coimbatore - TNPCB | 9.310000 |

| | StationName | CO |
|---|---|---|
| 0 | Lumpyngngad, Shillong - Meghalaya PCB | 0.240000 |
| 1 | Sikulpuikawn, Aizawl - Mizoram PCB | 0.280000 |
| 2 | Borivali East, Mumbai - MPCB | 0.370000 |
| 3 | Bollaram Industrial Area, Hyderabad - TSPCB | 0.410000 |
| 4 | Worli, Mumbai - MPCB | 0.410000 |
| 5 | Kurla, Mumbai - MPCB | 0.420000 |
| 6 | Colaba, Mumbai - MPCB | 0.460000 |
| 7 | Sanegurava Halli, Bengaluru - KSPCB | 0.480000 |
| 8 | ICRISAT Patancheru, Hyderabad - TSPCB | 0.490000 |
| 9 | Sion, Mumbai - MPCB | 0.490000 |

| | StationName | SO2 |
|---|---|---|
| 0 | Kariavattom, Thiruvananthapuram - Kerala PCB | 3.230000 |
| 1 | BWSSB Kadabesanahalli, Bengaluru - CPCB | 3.810000 |
| 2 | Sanegurava Halli, Bengaluru - KSPCB | 3.880000 |
| 3 | DRM Office Danapur, Patna - BSPCB | 4.780000 |
| 4 | Silk Board, Bengaluru - KSPCB | 4.820000 |
| 5 | Zoo Park, Hyderabad - TSPCB | 4.830000 |
| 6 | Patparganj, Delhi - DPCC | 4.840000 |
| 7 | Central University, Hyderabad - TSPCB | 5.150000 |
| 8 | Jayanagar 5th Block, Bengaluru - KSPCB | 5.270000 |
| 9 | Muradpur, Patna - BSPCB | 5.310000 |

| | StationName | O3 |
|---|---|---|
| 0 | Sikulpuikawn, Aizawl - Mizoram PCB | 3.570000 |
| 1 | Sanegurava Halli, Bengaluru - KSPCB | 6.290000 |
| 2 | Govt. High School Shikarpur, Patna - BSPCB | 7.060000 |
| 3 | Muradpur, Patna - BSPCB | 11.920000 |
| 4 | Chhatrapati Shivaji Intl. Airport (T2), Mumbai - MPCB | 12.920000 |
| 5 | GM Office, Brajrajnagar - OSPCB | 16.790000 |
| 6 | Vasai West, Mumbai - MPCB | 17.010000 |
| 7 | Talcher Coalfields,Talcher - OSPCB | 17.570000 |
| 8 | Manali, Chennai - CPCB | 18.850000 |
| 9 | Borivali East, Mumbai - MPCB | 19.020000 |

| | StationName | BTX |
|---|---|---|
| 0 | T T Nagar, Bhopal - MPPCB | 0.000000 |
| 1 | Bandra, Mumbai - MPCB | 0.030000 |
| 2 | SIDCO Kurichi, Coimbatore - TNPCB | 0.200000 |
| 3 | Lodhi Road, Delhi - IMD | 1.960000 |
| 4 | North Campus, DU, Delhi - IMD | 2.380000 |
| 5 | CRRI Mathura Road, Delhi - IMD | 2.770000 |
| 6 | Bollaram Industrial Area, Hyderabad - TSPCB | 2.960000 |
| 7 | ICRISAT Patancheru, Hyderabad - TSPCB | 3.440000 |
| 8 | Pusa, Delhi - IMD | 3.710000 |
| 9 | Secretariat, Amaravati - APPCB | 3.860000 |

| | City | PM2.5 |
|---|---|---|
| 0 | Patna | 123.110000 |
| 1 | Gurugram | 117.340000 |
| 2 | Delhi | 117.150000 |
| 3 | Lucknow | 109.940000 |
| 4 | Ahmedabad | 67.820000 |
| 5 | Jorapokhar | 64.670000 |
| 6 | Brajrajnagar | 64.360000 |
| 7 | Kolkata | 64.120000 |
| 8 | Guwahati | 63.940000 |
| 9 | Talcher | 61.010000 |

| | City | PM10 |
|---|---|---|
| 0 | Delhi | 232.730000 |
| 1 | Gurugram | 192.490000 |
| 2 | Talcher | 165.290000 |
| 3 | Jorapokhar | 150.390000 |
| 4 | Patna | 126.910000 |
| 5 | Brajrajnagar | 124.940000 |
| 6 | Jaipur | 123.400000 |
| 7 | Bhopal | 119.210000 |
| 8 | Guwahati | 116.600000 |
| 9 | Kolkata | 115.260000 |

| | City | NO |
|---|---|---|
| 0 | Kochi | 71.370000 |
| 1 | Delhi | 38.980000 |
| 2 | Patna | 31.800000 |
| 3 | Talcher | 31.770000 |
| 4 | Mumbai | 31.560000 |
| 5 | Kolkata | 26.840000 |
| 6 | Ernakulam | 23.570000 |
| 7 | Ahmedabad | 22.590000 |
| 8 | Guwahati | 20.010000 |
| 9 | Brajrajnagar | 19.200000 |

| | City | NO2 |
|---|---|---|
| 0 | Ahmedabad | 58.850000 |
| 1 | Delhi | 50.800000 |
| 2 | Kolkata | 40.300000 |
| 3 | Patna | 37.560000 |
| 4 | Visakhapatnam | 37.040000 |
| 5 | Lucknow | 33.220000 |
| 6 | Jaipur | 32.360000 |
| 7 | Bhopal | 31.290000 |
| 8 | Coimbatore | 28.970000 |
| 9 | Hyderabad | 28.430000 |

| | City | NOx |
|---|---|---|
| 0 | Jorapokhar | 99.990000 |
| 1 | Kochi | 68.410000 |
| 2 | Kolkata | 63.340000 |
| 3 | Delhi | 58.570000 |
| 4 | Mumbai | 55.180000 |
| 5 | Ahmedabad | 47.370000 |
| 6 | Patna | 46.110000 |
| 7 | Guwahati | 44.250000 |
| 8 | Jaipur | 39.650000 |
| 9 | Amritsar | 35.690000 |

| | City | NH3 |
|---|---|---|
| 0 | Chennai | 63.400000 |
| 1 | Delhi | 41.990000 |
| 2 | Brajrajnagar | 36.960000 |
| 3 | Chandigarh | 30.600000 |
| 4 | Lucknow | 29.220000 |
| 5 | Ahmedabad | 26.640000 |
| 6 | Jaipur | 26.470000 |
| 7 | Gurugram | 26.210000 |
| 8 | Aizawl | 22.310000 |
| 9 | Bengaluru | 22.160000 |

| | City | CO |
|---|---|---|
| 0 | Ahmedabad | 22.360000 |
| 1 | Lucknow | 2.130000 |
| 2 | Delhi | 1.980000 |
| 3 | Talcher | 1.850000 |
| 4 | Bengaluru | 1.840000 |
| 5 | Brajrajnagar | 1.790000 |
| 6 | Ernakulam | 1.630000 |
| 7 | Patna | 1.500000 |
| 8 | Kochi | 1.300000 |
| 9 | Gurugram | 1.260000 |

| | City | SO2 |
|---|---|---|
| 0 | Ahmedabad | 55.250000 |
| 1 | Jorapokhar | 34.640000 |
| 2 | Talcher | 28.410000 |
| 3 | Patna | 22.020000 |
| 4 | Kochi | 17.600000 |
| 5 | Delhi | 15.900000 |
| 6 | Mumbai | 15.710000 |
| 7 | Guwahati | 14.660000 |
| 8 | Amaravati | 14.270000 |
| 9 | Bhopal | 13.080000 |

| | City | O3 |
|---|---|---|
| 0 | Bhopal | 59.940000 |
| 1 | Delhi | 51.290000 |
| 2 | Jaipur | 46.600000 |
| 3 | Ahmedabad | 39.310000 |
| 4 | Amaravati | 38.130000 |
| 5 | Visakhapatnam | 37.600000 |
| 6 | Patna | 37.070000 |
| 7 | Lucknow | 36.990000 |
| 8 | Thiruvananthapuram | 34.520000 |
| 9 | Gurugram | 34.250000 |

| | City | BTX |
|---|---|---|
| 0 | Kolkata | 38.110000 |
| 1 | Ahmedabad | 37.630000 |
| 2 | Delhi | 26.780000 |
| 3 | Thiruvananthapuram | 22.350000 |
| 4 | Patna | 17.270000 |
| 5 | Visakhapatnam | 15.080000 |
| 6 | Gurugram | 14.640000 |
| 7 | Amritsar | 14.500000 |
| 8 | Hyderabad | 10.720000 |
| 9 | Lucknow | 10.410000 |

| | City | PM2.5 |
|---|---|---|
| 0 | Aizawl | 16.850000 |
| 1 | Ernakulam | 24.960000 |
| 2 | Thiruvananthapuram | 27.990000 |
| 3 | Coimbatore | 29.730000 |
| 4 | Shillong | 30.290000 |
| 5 | Kochi | 31.430000 |
| 6 | Mumbai | 35.260000 |
| 7 | Bengaluru | 36.090000 |
| 8 | Amaravati | 37.640000 |
| 9 | Chandigarh | 41.060000 |

| | City | PM10 |
|---|---|---|
| 0 | Aizawl | 23.340000 |
| 1 | Coimbatore | 39.230000 |
| 2 | Shillong | 41.640000 |
| 3 | Ernakulam | 48.310000 |
| 4 | Thiruvananthapuram | 52.790000 |
| 5 | Chennai | 62.950000 |
| 6 | Kochi | 67.340000 |
| 7 | Amaravati | 76.310000 |
| 8 | Bengaluru | 83.590000 |
| 9 | Chandigarh | 85.660000 |

| | City | NO |
|---|---|---|
| 0 | Shillong | 0.920000 |
| 1 | Thiruvananthapuram | 3.440000 |
| 2 | Amaravati | 4.450000 |
| 3 | Bhopal | 7.020000 |
| 4 | Coimbatore | 7.530000 |
| 5 | Hyderabad | 7.830000 |
| 6 | Chennai | 9.190000 |
| 7 | Bengaluru | 9.400000 |
| 8 | Aizawl | 9.410000 |
| 9 | Chandigarh | 10.470000 |

| | City | NO2 |
|---|---|---|
| 0 | Aizawl | 0.390000 |
| 1 | Shillong | 2.770000 |
| 2 | Ernakulam | 3.630000 |
| 3 | Thiruvananthapuram | 9.370000 |
| 4 | Jorapokhar | 9.370000 |
| 5 | Chandigarh | 11.630000 |
| 6 | Guwahati | 13.560000 |
| 7 | Talcher | 13.770000 |
| 8 | Kochi | 14.860000 |
| 9 | Brajrajnagar | 16.530000 |

| | City | NOx |
|---|---|---|
| 0 | Shillong | 1.000000 |
| 1 | Thiruvananthapuram | 8.160000 |
| 2 | Aizawl | 12.610000 |
| 3 | Chandigarh | 15.070000 |
| 4 | Amaravati | 15.390000 |
| 5 | Chennai | 17.660000 |
| 6 | Hyderabad | 19.460000 |
| 7 | Bengaluru | 19.700000 |
| 8 | Bhopal | 22.380000 |
| 9 | Lucknow | 22.460000 |

| | City | NH3 |
|---|---|---|
| 0 | Shillong | 2.810000 |
| 1 | Thiruvananthapuram | 5.070000 |
| 2 | Jorapokhar | 7.000000 |
| 3 | Kochi | 7.980000 |
| 4 | Coimbatore | 9.400000 |
| 5 | Visakhapatnam | 10.970000 |
| 6 | Guwahati | 11.100000 |
| 7 | Talcher | 11.600000 |
| 8 | Amaravati | 12.030000 |
| 9 | Mumbai | 13.820000 |

| | City | CO |
|---|---|---|
| 0 | Shillong | 0.240000 |
| 1 | Aizawl | 0.280000 |
| 2 | Amritsar | 0.550000 |
| 3 | Mumbai | 0.570000 |
| 4 | Hyderabad | 0.590000 |
| 5 | Amaravati | 0.630000 |
| 6 | Chandigarh | 0.630000 |
| 7 | Jorapokhar | 0.650000 |
| 8 | Guwahati | 0.730000 |
| 9 | Visakhapatnam | 0.740000 |

| | City | SO2 |
|---|---|---|
| 0 | Ernakulam | 3.180000 |
| 1 | Bengaluru | 5.510000 |
| 2 | Thiruvananthapuram | 5.650000 |
| 3 | Shillong | 6.620000 |
| 4 | Aizawl | 7.380000 |
| 5 | Chennai | 7.870000 |
| 6 | Amritsar | 8.130000 |
| 7 | Kolkata | 8.530000 |
| 8 | Coimbatore | 8.590000 |
| 9 | Hyderabad | 9.190000 |

| | City | O3 |
|---|---|---|
| 0 | Aizawl | 3.570000 |
| 1 | Kochi | 3.820000 |
| 2 | Ernakulam | 5.960000 |
| 3 | Brajrajnagar | 16.850000 |
| 4 | Talcher | 17.570000 |
| 5 | Chandigarh | 20.050000 |
| 6 | Amritsar | 22.440000 |
| 7 | Guwahati | 25.060000 |
| 8 | Shillong | 27.690000 |
| 9 | Coimbatore | 28.820000 |

| | City | BTX |
|---|---|---|
| 0 | Mumbai | 0.030000 |
| 1 | Ernakulam | 2.010000 |
| 2 | Amaravati | 3.860000 |
| 3 | Aizawl | 6.190000 |
| 4 | Guwahati | 7.260000 |
| 5 | Brajrajnagar | 7.900000 |
| 6 | Chandigarh | 9.090000 |
| 7 | Coimbatore | 9.840000 |
| 8 | Lucknow | 10.410000 |
| 9 | Hyderabad | 10.720000 |

Missing values are filled with the median of each column.
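As a minimal sketch of this step, assuming the data sits in a pandas DataFrame (the frame `df` and its values below are hypothetical, not taken from the dataset), each numeric column's gaps can be filled with that column's own median:

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the station-level frame (illustrative values only).
df = pd.DataFrame({
    "City": ["A", "A", "B", "B", "B"],
    "PM2.5": [10.0, np.nan, 30.0, 50.0, np.nan],
    "NO": [1.0, 2.0, np.nan, 4.0, 9.0],
})

# Median of every numeric column; NaN values are skipped by default.
medians = df.median(numeric_only=True)

# fillna with a Series fills each column using the entry whose label matches.
filled = df.fillna(medians)

print(filled.isnull().sum())
```

The median is less sensitive than the mean to extreme readings, which matters for columns whose maximum is far above the upper quartile.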

| | StationId | StationName | City | State | Date | PM2.5 | PM10 | NO | NO2 | NOx | NH3 | CO | SO2 | O3 | Benzene | Toluene | Xylene | AQI | Air_quality |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | AP001 | Secretariat, Amaravati - APPCB | Amaravati | Andhra Pradesh | 2017-11-24 | 71.36 | 115.75 | 1.75 | 20.65 | 12.40 | 12.19 | 0.10 | 10.76 | 109.26 | 0.17 | 5.92 | 0.10 | 184.0 | Moderate |
| 1 | AP001 | Secretariat, Amaravati - APPCB | Amaravati | Andhra Pradesh | 2017-11-25 | 81.40 | 124.50 | 1.44 | 20.50 | 12.08 | 10.72 | 0.12 | 15.24 | 127.09 | 0.20 | 6.50 | 0.06 | 184.0 | Moderate |
| 2 | AP001 | Secretariat, Amaravati - APPCB | Amaravati | Andhra Pradesh | 2017-11-26 | 78.32 | 129.06 | 1.26 | 26.00 | 14.85 | 10.28 | 0.14 | 26.96 | 117.44 | 0.22 | 7.95 | 0.08 | 197.0 | Moderate |
| 3 | AP001 | Secretariat, Amaravati - APPCB | Amaravati | Andhra Pradesh | 2017-11-27 | 88.76 | 135.32 | 6.60 | 30.85 | 21.77 | 12.91 | 0.11 | 33.59 | 111.81 | 0.29 | 7.63 | 0.12 | 198.0 | Moderate |
| 4 | AP001 | Secretariat, Amaravati - APPCB | Amaravati | Andhra Pradesh | 2017-11-28 | 64.18 | 104.09 | 2.56 | 28.07 | 17.01 | 11.42 | 0.09 | 19.00 | 138.18 | 0.17 | 5.02 | 0.07 | 188.0 | Moderate |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 107706 | WB013 | Victoria, Kolkata - WBPCB | Kolkata | West Bengal | 2020-06-27 | 8.65 | 16.46 | NaN | NaN | NaN | NaN | 0.69 | 4.36 | 30.59 | 1.32 | 7.26 | NaN | 50.0 | Good |
| 107707 | WB013 | Victoria, Kolkata - WBPCB | Kolkata | West Bengal | 2020-06-28 | 11.80 | 18.47 | 13.65 | 200.87 | 214.20 | 11.40 | 0.68 | 3.49 | 38.95 | 1.42 | 7.92 | NaN | 65.0 | Satisfactory |
| 107708 | WB013 | Victoria, Kolkata - WBPCB | Kolkata | West Bengal | 2020-06-29 | 18.60 | 32.26 | 13.65 | 200.87 | 214.20 | 11.40 | 0.78 | 5.12 | 38.17 | 3.52 | 8.64 | NaN | 63.0 | Satisfactory |
| 107709 | WB013 | Victoria, Kolkata - WBPCB | Kolkata | West Bengal | 2020-06-30 | 16.07 | 39.30 | 7.56 | 29.13 | 36.69 | 29.26 | 0.69 | 5.88 | 29.64 | 1.86 | 8.40 | NaN | 57.0 | Satisfactory |
| 107710 | WB013 | Victoria, Kolkata - WBPCB | Kolkata | West Bengal | 2020-07-01 | 10.50 | 36.50 | 7.78 | 22.50 | 30.25 | 27.23 | 0.58 | 2.80 | 13.10 | 1.31 | 7.39 | NaN | 59.0 | Satisfactory |
107711 rows × 19 columns
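
| StationId | StationName | City | State | Date | PM2.5 | PM10 | NO | NO2 | NOx | NH3 | CO | SO2 | O3 | Benzene | Toluene | Xylene | AQI | Air_quality |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 0 | 20417 | 41789 | 15629 | 15058 | 14346 | 47245 | 11386 | 23922 | 24213 | 30164 | 37453 | 84595 | 18958 | 18958 |

dtype: int64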

| | PM2.5 | PM10 | NO | NO2 | NOx | NH3 | CO | SO2 | O3 | Benzene | Toluene | Xylene | AQI |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 87294.000000 | 65922.000000 | 92082.000000 | 92653.000000 | 93365.000000 | 60466.000000 | 96325.000000 | 83789.000000 | 83498.000000 | 77547.000000 | 70258.000000 | 23116.000000 | 88753.000000 |
| mean | 80.344685 | 158.258377 | 23.065599 | 35.362506 | 41.214568 | 28.824049 | 1.628631 | 12.300694 | 38.221072 | 3.383664 | 15.552798 | 2.458169 | 179.803004 |
| std | 76.654693 | 123.416377 | 34.558080 | 29.746311 | 45.315124 | 24.998420 | 4.488459 | 13.196388 | 39.240139 | 11.300496 | 29.826612 | 6.734150 | 131.420900 |
| min | 0.020000 | 0.010000 | 0.010000 | 0.010000 | 0.000000 | 0.010000 | 0.000000 | 0.010000 | 0.010000 | 0.000000 | 0.000000 | 0.000000 | 8.000000 |
| 25% | 31.940000 | 70.490000 | 4.822500 | 15.130000 | 13.960000 | 11.930000 | 0.530000 | 5.040000 | 18.930000 | 0.160000 | 0.710000 | 0.000000 | 86.000000 |
| 50% | 56.010000 | 122.490000 | 10.270000 | 27.250000 | 26.620000 | 23.650000 | 0.910000 | 8.930000 | 30.880000 | 1.210000 | 4.380000 | 0.400000 | 133.000000 |
| 75% | 100.000000 | 208.967500 | 24.840000 | 47.030000 | 50.400000 | 38.200000 | 1.450000 | 14.900000 | 47.220000 | 3.610000 | 17.620000 | 2.120000 | 254.000000 |
| max | 1000.000000 | 1000.000000 | 470.000000 | 448.050000 | 467.630000 | 418.900000 | 175.810000 | 195.650000 | 963.000000 | 455.030000 | 454.850000 | 170.370000 | 2049.000000 |
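
| StationId | StationName | City | State | Date | PM2.5 | PM10 | NO | NO2 | NOx | NH3 | CO | SO2 | O3 | Benzene | Toluene | Xylene | AQI | Air_quality |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 4878 | 0 | 7484 | 0 | 0 | 12876 | 10550 | 6146 | 0 | 0 |

dtype: int64
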
Checking the data in several ways before the final visualisation

| StationId | StationName | City | State | Date | PM2.5 | PM10 | NO | NO2 | NOx | NH3 | CO | SO2 | O3 | Benzene | Toluene | Xylene | AQI | Air_quality |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |

dtype: int64

<class 'pandas.core.frame.DataFrame'>
Int64Index: 107711 entries, 0 to 107710
Data columns (total 19 columns):

| # | Column | Non-Null Count | Dtype |
|---|---|---|---|
| 0 | StationId | 107711 non-null | object |
| 1 | StationName | 107711 non-null | object |
| 2 | City | 107711 non-null | object |
| 3 | State | 107711 non-null | object |
| 4 | Date | 107711 non-null | datetime64[ns] |
| 5 | PM2.5 | 107711 non-null | float64 |
| 6 | PM10 | 107711 non-null | float64 |
| 7 | NO | 107711 non-null | float64 |
| 8 | NO2 | 107711 non-null | float64 |
| 9 | NOx | 107711 non-null | float64 |
| 10 | NH3 | 107711 non-null | float64 |
| 11 | CO | 107711 non-null | float64 |
| 12 | SO2 | 107711 non-null | float64 |
| 13 | O3 | 107711 non-null | float64 |
| 14 | Benzene | 107711 non-null | float64 |
| 15 | Toluene | 107711 non-null | float64 |
| 16 | Xylene | 107711 non-null | float64 |
| 17 | AQI | 107711 non-null | float64 |
| 18 | Air_quality | 107711 non-null | object |

dtypes: datetime64[ns](1), float64(13), object(5)
memory usage: 16.4+ MB

| | PM2.5 | PM10 | NO | NO2 | NOx | NH3 | CO | SO2 | O3 | Benzene | Toluene | Xylene | AQI |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 107711.000000 | 107711.000000 | 107711.000000 | 107711.000000 | 107711.000000 | 107711.000000 | 107711.000000 | 107711.000000 | 107711.000000 | 107711.000000 | 107711.000000 | 107711.000000 | 107711.000000 |
| mean | 75.731960 | 144.381199 | 21.208942 | 34.228378 | 40.476284 | 26.554569 | 1.615894 | 11.552082 | 36.570829 | 2.919585 | 12.096830 | 0.864530 | 171.565300 |
| std | 69.664187 | 98.111580 | 32.268880 | 27.731846 | 41.719310 | 18.905102 | 4.233512 | 11.723084 | 34.684843 | 9.604304 | 24.500655 | 3.227418 | 120.620077 |
| min | 0.020000 | 0.010000 | 0.010000 | 0.010000 | 0.010000 | 0.010000 | 0.010000 | 0.010000 | 0.010000 | 0.010000 | 0.010000 | 0.010000 | 8.000000 |
| 25% | 37.100000 | 101.960000 | 5.660000 | 16.930000 | 18.230000 | 21.050000 | 0.690000 | 6.080000 | 22.390000 | 1.210000 | 4.380000 | 0.400000 | 95.000000 |
| 50% | 56.010000 | 122.490000 | 10.270000 | 27.250000 | 26.620000 | 23.650000 | 0.910000 | 8.930000 | 30.880000 | 1.210000 | 4.380000 | 0.400000 | 133.000000 |
| 75% | 84.700000 | 146.915000 | 21.010000 | 42.770000 | 45.380000 | 26.330000 | 1.360000 | 12.670000 | 41.440000 | 2.410000 | 8.200000 | 0.400000 | 216.000000 |
| max | 1000.000000 | 1000.000000 | 470.000000 | 448.050000 | 467.630000 | 418.900000 | 175.810000 | 195.650000 | 963.000000 | 455.030000 | 454.850000 | 170.370000 | 2049.000000 |
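
The notebook cells behind these checks are not shown, so the following is only a sketch of the usual pandas calls that produce output of this shape. The frame `df` is hypothetical, and the zero-count check is an assumption about one of the methods used.

```python
import numpy as np
import pandas as pd

# Hypothetical three-row frame (illustrative values only).
df = pd.DataFrame({
    "City": ["A", "B", "C"],
    "PM2.5": [10.0, np.nan, 30.0],
    "CO": [0.0, 0.5, 1.2],
})

missing = df.isnull().sum()   # number of NaN entries per column
zeros = (df == 0).sum()       # number of exact zeros per column (assumed check)
summary = df.describe()       # count, mean, std, min, quartiles, max for numeric columns
df.info()                     # prints dtypes and non-null counts

print(missing)
print(zeros)
print(summary)
```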
array(['Secretariat, Amaravati - APPCB',
'GVM Corporation, Visakhapatnam - APPCB',
'Railway Colony, Guwahati - APCB',
'DRM Office Danapur, Patna - BSPCB',
'Govt. High School Shikarpur, Patna - BSPCB',
'IGSC Planetarium Complex, Patna - BSPCB',
'Muradpur, Patna - BSPCB', 'Rajbansi Nagar, Patna - BSPCB',
'Samanpura, Patna - BSPCB', 'Sector-25, Chandigarh - CPCC',
'Alipur, Delhi - DPCC', 'Anand Vihar, Delhi - DPCC',
'Ashok Vihar, Delhi - DPCC', 'Aya Nagar, Delhi - IMD',
'Bawana, Delhi - DPCC', 'Burari Crossing, Delhi - IMD',
'CRRI Mathura Road, Delhi - IMD', 'DTU, Delhi - CPCB',
'Dr. Karni Singh Shooting Range, Delhi - DPCC',
'Dwarka-Sector 8, Delhi - DPCC', 'East Arjun Nagar, Delhi - CPCB',
'IGI Airport (T3), Delhi - IMD',
'IHBAS, Dilshad Garden, Delhi - CPCB', 'ITO, Delhi - CPCB',
'Jahangirpuri, Delhi - DPCC',
'Jawaharlal Nehru Stadium, Delhi - DPCC',
'Lodhi Road, Delhi - IMD',
'Major Dhyan Chand National Stadium, Delhi - DPCC',
'Mandir Marg, Delhi - DPCC', 'Mundka, Delhi - DPCC',
'NSIT Dwarka, Delhi - CPCB', 'Najafgarh, Delhi - DPCC',
'Narela, Delhi - DPCC', 'Nehru Nagar, Delhi - DPCC',
'North Campus, DU, Delhi - IMD', 'Okhla Phase-2, Delhi - DPCC',
'Patparganj, Delhi - DPCC', 'Punjabi Bagh, Delhi - DPCC',
'Pusa, Delhi - DPCC', 'Pusa, Delhi - IMD',
'R K Puram, Delhi - DPCC', 'Rohini, Delhi - DPCC',
'Shadipur, Delhi - CPCB', 'Sirifort, Delhi - CPCB',
'Sonia Vihar, Delhi - DPCC', 'Sri Aurobindo Marg, Delhi - DPCC',
'Vivek Vihar, Delhi - DPCC', 'Wazirpur, Delhi - DPCC',
'Maninagar, Ahmedabad - GPCB', 'NISE Gwal Pahari, Gurugram - IMD',
'Sector-51, Gurugram - HSPCB', 'Teri Gram, Gurugram - HSPCB',
'Vikas Sadan, Gurugram - HSPCB',
'Tata Stadium, Jorapokhar - JSPCB', 'BTM Layout, Bengaluru - CPCB',
'BWSSB Kadabesanahalli, Bengaluru - CPCB',
'Bapuji Nagar, Bengaluru - KSPCB',
'City Railway Station, Bengaluru - KSPCB',
'Hebbal, Bengaluru - KSPCB', 'Hombegowda Nagar, Bengaluru - KSPCB',
'Jayanagar 5th Block, Bengaluru - KSPCB',
'Peenya, Bengaluru - CPCB', 'Sanegurava Halli, Bengaluru - KSPCB',
'Silk Board, Bengaluru - KSPCB',
'Kariavattom, Thiruvananthapuram - Kerala PCB',
'Plammoodu, Thiruvananthapuram - Kerala PCB',
'T T Nagar, Bhopal - MPPCB', 'Bandra, Mumbai - MPCB',
'Borivali East, Mumbai - MPCB',
'Chhatrapati Shivaji Intl. Airport (T2), Mumbai - MPCB',
'Colaba, Mumbai - MPCB', 'Kurla, Mumbai - MPCB',
'Powai, Mumbai - MPCB', 'Sion, Mumbai - MPCB',
'Vasai West, Mumbai - MPCB', 'Vile Parle West, Mumbai - MPCB',
'Worli, Mumbai - MPCB', 'Lumpyngngad, Shillong - Meghalaya PCB',
'Sikulpuikawn, Aizawl - Mizoram PCB',
'GM Office, Brajrajnagar - OSPCB',
'Talcher Coalfields,Talcher - OSPCB',
'Golden Temple, Amritsar - PPCB', 'Adarsh Nagar, Jaipur - RSPCB',
'Police Commissionerate, Jaipur - RSPCB',
'Shastri Nagar, Jaipur - RSPCB',
'Alandur Bus Depot, Chennai - CPCB',
'Manali Village, Chennai - TNPCB', 'Manali, Chennai - CPCB',
'Velachery Res. Area, Chennai - CPCB',
'SIDCO Kurichi, Coimbatore - TNPCB',
'Bollaram Industrial Area, Hyderabad - TSPCB',
'Central University, Hyderabad - TSPCB',
'ICRISAT Patancheru, Hyderabad - TSPCB',
'IDA Pashamylaram, Hyderabad - TSPCB',
'Sanathnagar, Hyderabad - TSPCB', 'Zoo Park, Hyderabad - TSPCB',
'Central School, Lucknow - CPCB', 'Gomti Nagar, Lucknow - UPPCB',
'Lalbagh, Lucknow - CPCB', 'Nishant Ganj, Lucknow - UPPCB',
'Talkatora District Industries Center, Lucknow - CPCB',
'Ballygunge, Kolkata - WBPCB', 'Bidhannagar, Kolkata - WBPCB',
'Fort William, Kolkata - WBPCB', 'Jadavpur, Kolkata - WBPCB',
'Rabindra Bharati University, Kolkata - WBPCB',
'Rabindra Sarobar, Kolkata - WBPCB', 'Victoria, Kolkata - WBPCB'],
dtype=object)
IHBAS, Dilshad Garden, Delhi - CPCB 2009
Manali, Chennai - CPCB 2009
NSIT Dwarka, Delhi - CPCB 2009
Bandra, Mumbai - MPCB 2009
Maninagar, Ahmedabad - GPCB 2009
...
DRM Office Danapur, Patna - BSPCB 126
Govt. High School Shikarpur, Patna - BSPCB 121
Teri Gram, Gurugram - HSPCB 119
Sector-51, Gurugram - HSPCB 119
Sikulpuikawn, Aizawl - Mizoram PCB 113
Name: StationName, Length: 108, dtype: int64

| | StationId | StationName | City | State | Date | PM2.5 | PM10 | NO | NO2 | NOx | NH3 | CO | SO2 | O3 | Benzene | Toluene | Xylene | AQI | Air_quality | Pollution content |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | AP001 | Secretariat, Amaravati - APPCB | Amaravati | Andhra Pradesh | 2017-11-24 | 71.36 | 115.75 | 1.75 | 20.65 | 12.40 | 12.19 | 0.10 | 10.76 | 109.26 | 0.17 | 5.92 | 0.10 | 184.0 | Moderate | 360.41 |
| 1 | AP001 | Secretariat, Amaravati - APPCB | Amaravati | Andhra Pradesh | 2017-11-25 | 81.40 | 124.50 | 1.44 | 20.50 | 12.08 | 10.72 | 0.12 | 15.24 | 127.09 | 0.20 | 6.50 | 0.06 | 184.0 | Moderate | 399.85 |
| 2 | AP001 | Secretariat, Amaravati - APPCB | Amaravati | Andhra Pradesh | 2017-11-26 | 78.32 | 129.06 | 1.26 | 26.00 | 14.85 | 10.28 | 0.14 | 26.96 | 117.44 | 0.22 | 7.95 | 0.08 | 197.0 | Moderate | 412.56 |
| 3 | AP001 | Secretariat, Amaravati - APPCB | Amaravati | Andhra Pradesh | 2017-11-27 | 88.76 | 135.32 | 6.60 | 30.85 | 21.77 | 12.91 | 0.11 | 33.59 | 111.81 | 0.29 | 7.63 | 0.12 | 198.0 | Moderate | 449.76 |
| 4 | AP001 | Secretariat, Amaravati - APPCB | Amaravati | Andhra Pradesh | 2017-11-28 | 64.18 | 104.09 | 2.56 | 28.07 | 17.01 | 11.42 | 0.09 | 19.00 | 138.18 | 0.17 | 5.02 | 0.07 | 188.0 | Moderate | 389.86 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 107706 | WB013 | Victoria, Kolkata - WBPCB | Kolkata | West Bengal | 2020-06-27 | 8.65 | 16.46 | 10.27 | 27.25 | 26.62 | 23.65 | 0.69 | 4.36 | 30.59 | 1.32 | 7.26 | 0.40 | 50.0 | Good | 157.52 |
| 107707 | WB013 | Victoria, Kolkata - WBPCB | Kolkata | West Bengal | 2020-06-28 | 11.80 | 18.47 | 13.65 | 200.87 | 214.20 | 11.40 | 0.68 | 3.49 | 38.95 | 1.42 | 7.92 | 0.40 | 65.0 | Satisfactory | 523.25 |
| 107708 | WB013 | Victoria, Kolkata - WBPCB | Kolkata | West Bengal | 2020-06-29 | 18.60 | 32.26 | 13.65 | 200.87 | 214.20 | 11.40 | 0.78 | 5.12 | 38.17 | 3.52 | 8.64 | 0.40 | 63.0 | Satisfactory | 547.61 |
| 107709 | WB013 | Victoria, Kolkata - WBPCB | Kolkata | West Bengal | 2020-06-30 | 16.07 | 39.30 | 7.56 | 29.13 | 36.69 | 29.26 | 0.69 | 5.88 | 29.64 | 1.86 | 8.40 | 0.40 | 57.0 | Satisfactory | 204.88 |
| 107710 | WB013 | Victoria, Kolkata - WBPCB | Kolkata | West Bengal | 2020-07-01 | 10.50 | 36.50 | 7.78 | 22.50 | 30.25 | 27.23 | 0.58 | 2.80 | 13.10 | 1.31 | 7.39 | 0.40 | 59.0 | Satisfactory | 160.34 |
107711 rows × 20 columns
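
The Pollution content values in the table above match the row-wise sum of the twelve pollutant columns; for the first row, the twelve readings add up to 360.41. A minimal sketch of that computation on the first and last rows shown above (the names `df` and `pollutant_cols` are assumptions):

```python
import pandas as pd

pollutant_cols = ["PM2.5", "PM10", "NO", "NO2", "NOx", "NH3",
                  "CO", "SO2", "O3", "Benzene", "Toluene", "Xylene"]

# First and last rows of the table above.
df = pd.DataFrame(
    [[71.36, 115.75, 1.75, 20.65, 12.40, 12.19, 0.10, 10.76, 109.26, 0.17, 5.92, 0.10],
     [10.50, 36.50, 7.78, 22.50, 30.25, 27.23, 0.58, 2.80, 13.10, 1.31, 7.39, 0.40]],
    columns=pollutant_cols,
)

# Sum across the pollutant columns for each row, rounded to two decimals.
df["Pollution content"] = df[pollutant_cols].sum(axis=1).round(2)

print(df["Pollution content"].tolist())
```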
Plotting various aspects of the data
To fix the error "'Series' object has no attribute 'iplot'", we use the cufflinks library, which adds the iplot method to pandas objects.
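
| City | Date | PM2.5 | PM10 | NO | NO2 | NOx | NH3 | CO | SO2 | O3 | Benzene | Toluene | Xylene | AQI | Air_quality |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 4321 | 10866 | 3276 | 3278 | 3980 | 10061 | 1745 | 3510 | 3664 | 5298 | 7739 | 17878 | 4174 | 4174 |

dtype: int64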

| | PM2.5 | PM10 | NO | NO2 | NOx | NH3 | CO | SO2 | O3 | Benzene | Toluene | Xylene | AQI |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 25210.000000 | 18665.000000 | 26255.000000 | 26253.000000 | 25551.000000 | 19470.000000 | 27786.000000 | 26021.000000 | 25867.000000 | 24233.000000 | 21792.000000 | 11653.000000 | 25357.000000 |
| mean | 67.444977 | 118.257649 | 17.664483 | 28.488332 | 32.327829 | 23.451706 | 2.254926 | 14.658364 | 34.448364 | 3.357247 | 8.736095 | 3.101291 | 166.489017 |
| std | 65.132855 | 90.949061 | 23.287194 | 24.504528 | 31.770902 | 25.655621 | 7.072508 | 18.488735 | 21.743220 | 16.114631 | 20.153048 | 6.789023 | 141.084091 |
| min | 0.040000 | 0.010000 | 0.020000 | 0.010000 | 0.000000 | 0.010000 | 0.000000 | 0.010000 | 0.010000 | 0.000000 | 0.000000 | 0.000000 | 13.000000 |
| 25% | 28.750000 | 56.200000 | 5.630000 | 11.690000 | 12.790000 | 8.490000 | 0.510000 | 5.660000 | 18.780000 | 0.120000 | 0.580000 | 0.130000 | 81.000000 |
| 50% | 48.485000 | 95.710000 | 9.880000 | 21.590000 | 23.500000 | 15.800000 | 0.890000 | 9.160000 | 30.790000 | 1.070000 | 2.960000 | 0.960000 | 118.000000 |
| 75% | 80.487500 | 149.890000 | 19.950000 | 37.520000 | 40.140000 | 30.000000 | 1.450000 | 15.290000 | 45.545000 | 3.090000 | 9.110000 | 3.330000 | 208.000000 |
| max | 949.990000 | 1000.000000 | 390.680000 | 362.210000 | 467.630000 | 352.890000 | 175.810000 | 193.860000 | 257.730000 | 455.030000 | 454.850000 | 170.370000 | 2049.000000 |
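
| City | Date | PM2.5 | PM10 | NO | NO2 | NOx | NH3 | CO | SO2 | O3 | Benzene | Toluene | Xylene | AQI | Air_quality |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 0 | 0 | 757 | 0 | 2413 | 0 | 0 | 3913 | 2943 | 1812 | 0 | 0 |

dtype: int64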

| City | Date | PM2.5 | PM10 | NO | NO2 | NOx | NH3 | CO | SO2 | O3 | Benzene | Toluene | Xylene | AQI | Air_quality |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |

dtype: int64

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 29531 entries, 0 to 29530
Data columns (total 16 columns):

| # | Column | Non-Null Count | Dtype |
|---|---|---|---|
| 0 | City | 29531 non-null | object |
| 1 | Date | 29531 non-null | datetime64[ns] |
| 2 | PM2.5 | 29531 non-null | float64 |
| 3 | PM10 | 29531 non-null | float64 |
| 4 | NO | 29531 non-null | float64 |
| 5 | NO2 | 29531 non-null | float64 |
| 6 | NOx | 29531 non-null | float64 |
| 7 | NH3 | 29531 non-null | float64 |
| 8 | CO | 29531 non-null | float64 |
| 9 | SO2 | 29531 non-null | float64 |
| 10 | O3 | 29531 non-null | float64 |
| 11 | Benzene | 29531 non-null | float64 |
| 12 | Toluene | 29531 non-null | float64 |
| 13 | Xylene | 29531 non-null | float64 |
| 14 | AQI | 29531 non-null | float64 |
| 15 | Air_quality | 29531 non-null | object |

dtypes: datetime64[ns](1), float64(13), object(2)
memory usage: 3.6+ MB

| | PM2.5 | PM10 | NO | NO2 | NOx | NH3 | CO | SO2 | O3 | Benzene | Toluene | Xylene | AQI |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 29531.000000 | 29531.000000 | 29531.000000 | 29531.000000 | 29531.000000 | 29531.000000 | 29531.000000 | 29531.000000 | 29531.00000 | 29531.000000 | 29531.000000 | 29531.000000 | 29531.000000 |
| mean | 64.670738 | 109.961189 | 16.800917 | 27.722603 | 31.740472 | 20.844825 | 2.246994 | 14.004839 | 33.99446 | 3.088684 | 7.517378 | 1.863863 | 159.635434 |
| std | 60.551112 | 73.118152 | 22.093196 | 23.205864 | 29.303783 | 21.144911 | 6.849184 | 17.446163 | 20.38535 | 14.599926 | 17.397788 | 4.372919 | 131.820310 |
| min | 0.040000 | 0.010000 | 0.020000 | 0.010000 | 0.030000 | 0.010000 | 0.010000 | 0.010000 | 0.01000 | 0.010000 | 0.010000 | 0.010000 | 13.000000 |
| 25% | 31.870000 | 78.440000 | 6.140000 | 12.770000 | 15.710000 | 11.910000 | 0.680000 | 6.040000 | 20.42000 | 0.850000 | 2.620000 | 0.960000 | 87.000000 |
| 50% | 48.485000 | 95.710000 | 9.880000 | 21.590000 | 23.500000 | 15.800000 | 0.890000 | 9.160000 | 30.79000 | 1.070000 | 2.960000 | 0.960000 | 118.000000 |
| 75% | 72.920000 | 112.950000 | 17.760000 | 34.820000 | 36.255000 | 22.045000 | 1.400000 | 13.955000 | 43.00000 | 2.470000 | 6.110000 | 0.960000 | 182.000000 |
| max | 949.990000 | 1000.000000 | 390.680000 | 362.210000 | 467.630000 | 352.890000 | 175.810000 | 193.860000 | 257.73000 | 455.030000 | 454.850000 | 170.370000 | 2049.000000 |

We know that fuel consumption by industry and vehicles fell during COVID-19. So we interpret the data in terms of two groups of pollutants produced by that fuel consumption, and compare the data before and after COVID-19.
Making two pollutant groups
Vehicular Pollutants = PM2.5 + PM10 + NO + NO2 + NOx + NH3 + CO
Industrial Pollutants = SO2 + O3 + Benzene + Toluene + Xylene
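
The two group definitions above translate directly into row-wise sums. A minimal sketch, assuming a pandas DataFrame with the city-level pollutant columns (the frame `city` and its values are hypothetical):

```python
import pandas as pd

# Hypothetical two-row frame with the city-level pollutant columns.
city = pd.DataFrame({
    "PM2.5": [50.0, 80.0], "PM10": [100.0, 120.0], "NO": [5.0, 10.0],
    "NO2": [20.0, 30.0], "NOx": [25.0, 35.0], "NH3": [15.0, 20.0],
    "CO": [1.0, 2.0], "SO2": [10.0, 12.0], "O3": [30.0, 40.0],
    "Benzene": [1.0, 2.0], "Toluene": [3.0, 4.0], "Xylene": [0.5, 1.0],
})

vehicular_cols = ["PM2.5", "PM10", "NO", "NO2", "NOx", "NH3", "CO"]
industrial_cols = ["SO2", "O3", "Benzene", "Toluene", "Xylene"]

# Each group is the row-wise sum of its member columns.
city["Vehicular Pollutants"] = city[vehicular_cols].sum(axis=1)
city["Industrial Pollutants"] = city[industrial_cols].sum(axis=1)

# The total is the sum of the two groups.
city["Total Pollutants"] = city["Vehicular Pollutants"] + city["Industrial Pollutants"]

print(city[["Vehicular Pollutants", "Industrial Pollutants", "Total Pollutants"]])
```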

<class 'pandas.core.frame.DataFrame'>
Int64Index: 24908 entries, 0 to 29348
Data columns (total 7 columns):

| # | Column | Non-Null Count | Dtype |
|---|---|---|---|
| 0 | City | 24908 non-null | object |
| 1 | Date | 24908 non-null | datetime64[ns] |
| 2 | AQI | 24908 non-null | float64 |
| 3 | Air_quality | 24908 non-null | object |
| 4 | Vehicular Pollutants | 24908 non-null | float64 |
| 5 | Industrial Pollutants | 24908 non-null | float64 |
| 6 | Total Pollutants | 24908 non-null | float64 |

dtypes: datetime64[ns](1), float64(4), object(2)
memory usage: 1.5+ MB

| | City | Date | AQI | Air_quality | Vehicular Pollutants | Industrial Pollutants | Total Pollutants |
|---|---|---|---|---|---|---|---|
| 0 | Ahmedabad | 2015-01-01 | 118.0 | Moderate | 197.205 | 163.05 | 360.255 |
| 1 | Ahmedabad | 2015-01-02 | 118.0 | Moderate | 194.085 | 71.56 | 265.645 |
| 2 | Ahmedabad | 2015-01-03 | 118.0 | Moderate | 243.795 | 85.22 | 329.015 |
| 3 | Ahmedabad | 2015-01-04 | 118.0 | Moderate | 199.845 | 70.24 | 270.085 |
| 4 | Ahmedabad | 2015-01-05 | 118.0 | Moderate | 263.375 | 107.32 | 370.695 |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 29344 | Visakhapatnam | 2019-12-28 | 110.0 | Moderate | 240.130 | 65.65 | 305.780 |
| 29345 | Visakhapatnam | 2019-12-29 | 133.0 | Moderate | 180.610 | 93.24 | 273.850 |
| 29346 | Visakhapatnam | 2019-12-30 | 92.0 | Satisfactory | 216.790 | 101.37 | 318.160 |
| 29347 | Visakhapatnam | 2019-12-31 | 92.0 | Satisfactory | 222.470 | 94.36 | 316.830 |
| 29348 | Visakhapatnam | 2020-01-01 | 111.0 | Moderate | 235.200 | 92.51 | 327.710 |
24908 rows × 7 columns
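
The frame above ends at 2020-01-01, which is consistent with splitting the data by date into a before-COVID and an after-COVID part. The sketch below shows one way such a comparison could be set up. The cutoff date, the sample values, and the choice of Welch's t statistic are all assumptions for illustration; this section does not state which test was used.

```python
import numpy as np
import pandas as pd

# Hypothetical daily totals; the real frame has one row per city per day.
data = pd.DataFrame({
    "Date": pd.to_datetime(["2019-12-30", "2019-12-31", "2020-01-01",
                            "2020-04-01", "2020-04-02", "2020-04-03"]),
    "Total Pollutants": [320.0, 310.0, 330.0, 200.0, 220.0, 210.0],
})

cutoff = pd.Timestamp("2020-01-01")  # assumed boundary, not confirmed by the source

before = data.loc[data["Date"] <= cutoff, "Total Pollutants"]
after = data.loc[data["Date"] > cutoff, "Total Pollutants"]


def welch_t(a, b):
    """Welch's t statistic for two samples with possibly unequal variances."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Standard error of the difference in means, using sample variances.
    se = np.sqrt(a.var(ddof=1) / a.size + b.var(ddof=1) / b.size)
    return (a.mean() - b.mean()) / se


t_stat = welch_t(before, after)
print(before.mean(), after.mean(), t_stat)
```

A large positive statistic would indicate that the mean before the cutoff is higher than the mean after it, relative to the spread of both samples.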

| | Pollutants | Mean | Variance |
|---|---|---|---|
| 0 | Vehicular Pollutants | 284.058727 | 28761.160744 |
| 1 | Industrial Pollutants | 60.332131 | 1715.762624 |
| 2 | Total Pollutants | 344.390859 | 35606.345506 |

Normalization is generally required when attributes are measured on different scales. Otherwise, an attribute with large values can dominate an equally important attribute with smaller values and dilute its effect.
In simple terms, when attributes have values on different scales, data mining operations can produce poor models, so the attributes are normalized to bring them onto the same scale.
We use standard scaling, which transforms each attribute to have a mean of 0 and a variance of 1.
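As a minimal sketch of what standard scaling does, the snippet below rescales the first five Total Pollutants values shown in this section using only the Python standard library. It assumes the scaler divides by the population standard deviation, as scikit-learn's `StandardScaler` does; the helper name `standard_scale` is hypothetical and not taken from the notebook.

```python
import math
import statistics


def standard_scale(values):
    """Return z-scores: subtract the mean, divide by the population std."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population std (divides by n)
    return [(v - mean) / std for v in values]


# First five "Total Pollutants" values shown in this section.
total = [360.255, 265.645, 329.015, 270.085, 370.695]
scaled = standard_scale(total)

print(statistics.fmean(scaled))   # close to 0, up to floating-point error
print(statistics.pstdev(scaled))  # close to 1 (population std)

# The sample std (divides by n - 1) is slightly above 1: sqrt(n / (n - 1)).
print(statistics.stdev(scaled), math.sqrt(len(total) / (len(total) - 1)))
```

With n = 24908 rows, sqrt(n / (n - 1)) is about 1.00002, which is consistent with the standard deviation of 1.00002 reported after scaling.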
| Vehicular Pollutants | Industrial Pollutants | Total Pollutants | |
|---|---|---|---|
| 0 | 197.205 | 163.05 | 360.255 |
| 1 | 194.085 | 71.56 | 265.645 |
| 2 | 243.795 | 85.22 | 329.015 |
| 3 | 199.845 | 70.24 | 270.085 |
| 4 | 263.375 | 107.32 | 370.695 |
| ... | ... | ... | ... |
| 29344 | 240.130 | 65.65 | 305.780 |
| 29345 | 180.610 | 93.24 | 273.850 |
| 29346 | 216.790 | 101.37 | 318.160 |
| 29347 | 222.470 | 94.36 | 316.830 |
| 29348 | 235.200 | 92.51 | 327.710 |
24908 rows × 3 columns
| Vehicular Pollutants | Industrial Pollutants | Total Pollutants | |
|---|---|---|---|
| 0 | -0.512146 | 2.479854 | 0.084074 |
| 1 | -0.530544 | 0.271067 | -0.417323 |
| 2 | -0.237421 | 0.600852 | -0.081486 |
| 3 | -0.496579 | 0.239200 | -0.393793 |
| 4 | -0.121965 | 1.134399 | 0.139402 |
Vehicular Pollutants 2.464706e-16 Industrial Pollutants -2.707183e-16 Total Pollutants -3.765523e-17 dtype: float64
Vehicular Pollutants 1.00002 Industrial Pollutants 1.00002 Total Pollutants 1.00002 dtype: float64
Vehicular Pollutants 2.354917 Industrial Pollutants 2.941972 Total Pollutants 2.157560 dtype: float64
| Vehicular Pollutants | Industrial Pollutants | Total Pollutants | |
|---|---|---|---|
| count | 24908.000000 | 24908.000000 | 24908.000000 |
| mean | 284.058727 | 60.332131 | 344.390859 |
| std | 169.591158 | 41.421765 | 188.696437 |
| min | 7.880000 | 3.340000 | 21.570000 |
| 25% | 192.005000 | 37.290000 | 235.686250 |
| 50% | 231.715000 | 46.050000 | 285.120000 |
| 75% | 332.065000 | 70.880000 | 409.451250 |
| max | 2137.260000 | 776.150000 | 2326.750000 |
[Figures: two plot grids, each with panels titled Vehicular Pollutants, Industrial Pollutants, and Total Pollutants]
The hypothesis test is performed on Total Pollutants, with the significance level (alpha, the critical p-value) set to 0.05.
We take a total pollutant concentration above 320 µg/m³ to indicate bad air quality.
Based on this threshold, we set up the hypotheses:
Null hypothesis (H0): mean Total Pollutants >= 320; alternative hypothesis (H1): mean Total Pollutants < 320
Calculating the Z Score value:
0.12925977347574807
Calculating the p-value:
0.4485760502293482
The p-value is greater than 0.05, so we fail to reject the null hypothesis.
This is consistent with poor air quality before COVID-19, which is what we would expect.
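The reported z-score and p-value can be reproduced from the summary statistics of Total Pollutants (mean 344.390859, standard deviation 188.696437). The sketch below is a reconstruction, not the notebook's own code: the function name `z_and_p` is hypothetical, and the formulas are the ones that happen to match the printed numbers. Note that this z divides by the standard deviation of the daily values rather than by the standard error of the mean (std / sqrt(n)), and that the p-value is the smaller one-sided tail area.

```python
from statistics import NormalDist


def z_and_p(sample_mean, sample_std, threshold):
    """Return z = (mean - threshold) / std and the smaller one-sided tail area."""
    z = (sample_mean - threshold) / sample_std
    p = NormalDist().cdf(-abs(z))  # tail area beyond |z| under the standard normal
    return z, p


# Mean and std of Total Pollutants from the summary statistics in this section.
z, p = z_and_p(sample_mean=344.390859, sample_std=188.696437, threshold=320.0)
print(z, p)

alpha = 0.05
print("reject H0" if p < alpha else "fail to reject H0")
```

Dividing by the standard error instead would give a much larger test statistic for a sample of this size, so the choice of denominator matters when interpreting the result.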
Making two types of pollutant Groups
Vehicular Pollutant = PM2.5 + PM10 + NO + NO2 + NOx + NH3 + CO
Industrial Pollutant = SO2 + O3 + Benzene + Toluene + Xylene
<class 'pandas.core.frame.DataFrame'>
Int64Index: 4623 entries, 1827 to 29530
Data columns (total 7 columns):
| # | Column | Non-Null Count | Dtype |
|---|---|---|---|
| 0 | City | 4623 non-null | object |
| 1 | Date | 4623 non-null | datetime64[ns] |
| 2 | AQI | 4623 non-null | float64 |
| 3 | Air_quality | 4623 non-null | object |
| 4 | Vehicular Pollutants | 4623 non-null | float64 |
| 5 | Industrial Pollutants | 4623 non-null | float64 |
| 6 | Total Pollutants | 4623 non-null | float64 |
dtypes: datetime64[ns](1), float64(4), object(2)
memory usage: 288.9+ KB
| City | Date | AQI | Air_quality | Vehicular Pollutants | Industrial Pollutants | Total Pollutants | |
|---|---|---|---|---|---|---|---|
| 1827 | Ahmedabad | 2020-01-02 | 162.0 | Moderate | 248.62 | 85.00 | 333.62 |
| 1828 | Ahmedabad | 2020-01-03 | 220.0 | Poor | 256.23 | 97.88 | 354.11 |
| 1829 | Ahmedabad | 2020-01-04 | 254.0 | Poor | 276.04 | 100.41 | 376.45 |
| 1830 | Ahmedabad | 2020-01-05 | 255.0 | Poor | 219.89 | 106.40 | 326.29 |
| 1831 | Ahmedabad | 2020-01-06 | 175.0 | Moderate | 217.00 | 98.16 | 315.16 |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 29526 | Visakhapatnam | 2020-06-27 | 41.0 | Good | 131.18 | 46.89 | 178.07 |
| 29527 | Visakhapatnam | 2020-06-28 | 70.0 | Satisfactory | 156.99 | 46.19 | 203.18 |
| 29528 | Visakhapatnam | 2020-06-29 | 68.0 | Satisfactory | 151.14 | 40.36 | 191.50 |
| 29529 | Visakhapatnam | 2020-06-30 | 54.0 | Satisfactory | 129.27 | 43.13 | 172.40 |
| 29530 | Visakhapatnam | 2020-07-01 | 50.0 | Good | 128.09 | 24.14 | 152.23 |
4623 rows × 7 columns
Before the hypothesis test, we normalise and standardise the dataset.
| | Pollutants | Mean | Variance |
|---|---|---|---|
| 0 | Vehicular Pollutants | 219.726823 | 18957.688067 |
| 1 | Industrial Pollutants | 61.207863 | 4191.823405 |
| 2 | Total Pollutants | 280.934686 | 23610.013374 |
| Vehicular Pollutants | Industrial Pollutants | Total Pollutants | |
|---|---|---|---|
| 1827 | 248.62 | 85.00 | 333.62 |
| 1828 | 256.23 | 97.88 | 354.11 |
| 1829 | 276.04 | 100.41 | 376.45 |
| 1830 | 219.89 | 106.40 | 326.29 |
| 1831 | 217.00 | 98.16 | 315.16 |
| ... | ... | ... | ... |
| 29526 | 131.18 | 46.89 | 178.07 |
| 29527 | 156.99 | 46.19 | 203.18 |
| 29528 | 151.14 | 40.36 | 191.50 |
| 29529 | 129.27 | 43.13 | 172.40 |
| 29530 | 128.09 | 24.14 | 152.23 |
4623 rows × 3 columns
| Vehicular Pollutants | Industrial Pollutants | Total Pollutants | |
|---|---|---|---|
| 0 | 0.209870 | 0.367518 | 0.342917 |
| 1 | 0.265146 | 0.566476 | 0.476281 |
| 2 | 0.409039 | 0.605557 | 0.621687 |
| 3 | 0.001185 | 0.698085 | 0.295207 |
| 4 | -0.019807 | 0.570801 | 0.222765 |
Vehicular Pollutants 1.229579e-17 Industrial Pollutants -4.303525e-17 Total Pollutants 6.762682e-17 dtype: float64
Vehicular Pollutants 1.000108 Industrial Pollutants 1.000108 Total Pollutants 1.000108 dtype: float64
Vehicular Pollutants 1.872608 Industrial Pollutants 10.533733 Total Pollutants 1.835976 dtype: float64
| Vehicular Pollutants | Industrial Pollutants | Total Pollutants | |
|---|---|---|---|
| count | 4623.000000 | 4623.000000 | 4623.000000 |
| mean | 219.726823 | 61.207863 | 280.934686 |
| std | 137.686920 | 64.744292 | 153.655502 |
| min | 15.790000 | 3.980000 | 38.910000 |
| 25% | 126.520000 | 37.800000 | 174.445000 |
| 50% | 189.730000 | 50.840000 | 251.860000 |
| 75% | 270.150000 | 71.225000 | 342.095000 |
| max | 1266.900000 | 969.380000 | 1298.770000 |
[Figures: two plot grids, each with panels titled Vehicular Pollutants, Industrial Pollutants, and Total Pollutants]
The hypothesis test is performed on Total Pollutants, with the significance level (alpha, the critical p-value) set to 0.05.
We take a total pollutant concentration above 320 µg/m³ to indicate bad air quality.
Based on this threshold, we set up the hypotheses:
Null hypothesis (H0): mean Total Pollutants >= 320; alternative hypothesis (H1): mean Total Pollutants < 320
Calculating the Z-Score value:
-0.25423960141421537
Calculating the p-value:
0.39965522897962547
The p-value is again greater than 0.05, so we fail to reject the null hypothesis.
This is consistent with air quality remaining poor after COVID-19 as well, which is again what we would expect.
Here, we predict the AQI in two ways:
1) Predicting the yearly AQI for India as a whole using a linear regression model
2) Predicting the AQI of each city for the upcoming years
| City | Date | PM2.5 | PM10 | NO | NO2 | NOx | NH3 | CO | SO2 | O3 | Benzene | Toluene | Xylene | AQI | Air_quality | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Ahmedabad | 2015-01-01 | 48.485 | 95.71 | 0.92 | 18.22 | 17.15 | 15.80 | 0.92 | 27.64 | 133.36 | 1.07 | 0.02 | 0.96 | 118.0 | Moderate |
| 1 | Ahmedabad | 2015-01-02 | 48.485 | 95.71 | 0.97 | 15.69 | 16.46 | 15.80 | 0.97 | 24.55 | 34.06 | 3.68 | 5.50 | 3.77 | 118.0 | Moderate |
| 2 | Ahmedabad | 2015-01-03 | 48.485 | 95.71 | 17.40 | 19.30 | 29.70 | 15.80 | 17.40 | 29.07 | 30.70 | 6.80 | 16.40 | 2.25 | 118.0 | Moderate |
| 3 | Ahmedabad | 2015-01-04 | 48.485 | 95.71 | 1.70 | 18.48 | 17.97 | 15.80 | 1.70 | 18.59 | 36.08 | 4.43 | 10.14 | 1.00 | 118.0 | Moderate |
| 4 | Ahmedabad | 2015-01-05 | 48.485 | 95.71 | 22.10 | 21.42 | 37.76 | 15.80 | 22.10 | 39.33 | 39.31 | 7.01 | 18.89 | 2.78 | 118.0 | Moderate |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 29526 | Visakhapatnam | 2020-06-27 | 15.020 | 50.94 | 7.68 | 25.06 | 19.54 | 12.47 | 0.47 | 8.55 | 23.30 | 2.24 | 12.07 | 0.73 | 41.0 | Good |
| 29527 | Visakhapatnam | 2020-06-28 | 24.380 | 74.09 | 3.42 | 26.06 | 16.53 | 11.99 | 0.52 | 12.72 | 30.14 | 0.74 | 2.21 | 0.38 | 70.0 | Satisfactory |
| 29528 | Visakhapatnam | 2020-06-29 | 22.910 | 65.73 | 3.45 | 29.53 | 18.33 | 10.71 | 0.48 | 8.42 | 30.96 | 0.01 | 0.01 | 0.96 | 68.0 | Satisfactory |
| 29529 | Visakhapatnam | 2020-06-30 | 16.640 | 49.97 | 4.05 | 29.26 | 18.80 | 10.03 | 0.52 | 9.84 | 28.30 | 1.07 | 2.96 | 0.96 | 54.0 | Satisfactory |
| 29530 | Visakhapatnam | 2020-07-01 | 15.000 | 66.00 | 0.40 | 26.85 | 14.05 | 5.20 | 0.59 | 2.10 | 17.05 | 1.07 | 2.96 | 0.96 | 50.0 | Good |
29531 rows × 16 columns
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 29531 entries, 0 to 29530
Data columns (total 16 columns):
| # | Column | Non-Null Count | Dtype |
|---|---|---|---|
| 0 | City | 29531 non-null | object |
| 1 | Date | 29531 non-null | datetime64[ns] |
| 2 | PM2.5 | 29531 non-null | float64 |
| 3 | PM10 | 29531 non-null | float64 |
| 4 | NO | 29531 non-null | float64 |
| 5 | NO2 | 29531 non-null | float64 |
| 6 | NOx | 29531 non-null | float64 |
| 7 | NH3 | 29531 non-null | float64 |
| 8 | CO | 29531 non-null | float64 |
| 9 | SO2 | 29531 non-null | float64 |
| 10 | O3 | 29531 non-null | float64 |
| 11 | Benzene | 29531 non-null | float64 |
| 12 | Toluene | 29531 non-null | float64 |
| 13 | Xylene | 29531 non-null | float64 |
| 14 | AQI | 29531 non-null | float64 |
| 15 | Air_quality | 29531 non-null | object |
dtypes: datetime64[ns](1), float64(13), object(2)
memory usage: 3.6+ MB
| Date | Month | Year | AQI_avg | |
|---|---|---|---|---|
| 0 | 2015-01-01 | 1 | 2015 | 177.000000 |
| 1 | 2015-01-02 | 1 | 2015 | 174.000000 |
| 2 | 2015-01-03 | 1 | 2015 | 122.166667 |
| 3 | 2015-01-04 | 1 | 2015 | 146.714286 |
| 4 | 2015-01-05 | 1 | 2015 | 147.571429 |
| ... | ... | ... | ... | ... |
| 2004 | 2020-06-27 | 6 | 2020 | 74.346154 |
| 2005 | 2020-06-28 | 6 | 2020 | 79.038462 |
| 2006 | 2020-06-29 | 6 | 2020 | 78.000000 |
| 2007 | 2020-06-30 | 6 | 2020 | 72.230769 |
| 2008 | 2020-07-01 | 7 | 2020 | 89.615385 |
2009 rows × 4 columns
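The daily table above averages AQI across cities for each date. A minimal standard-library sketch of that aggregation is shown below, using only four rows (Ahmedabad and Delhi on two dates) copied from tables in this document, so its numbers will not match the full-data averages. Computing the yearly value as the mean of the daily averages is an assumption; the notebook's exact aggregation is not shown.

```python
from collections import defaultdict
from datetime import date
from statistics import fmean

# Illustrative subset: (city, date, AQI) rows copied from tables in this document.
rows = [
    ("Ahmedabad", date(2015, 1, 1), 118.0),
    ("Delhi", date(2015, 1, 1), 472.0),
    ("Ahmedabad", date(2015, 1, 2), 118.0),
    ("Delhi", date(2015, 1, 2), 454.0),
]

# Step 1: average AQI across cities for each date.
by_date = defaultdict(list)
for _city, day, aqi in rows:
    by_date[day].append(aqi)
daily_avg = {day: fmean(values) for day, values in by_date.items()}

# Step 2 (assumed): average the daily values within each calendar year.
by_year = defaultdict(list)
for day, avg in daily_avg.items():
    by_year[day.year].append(avg)
yearly_avg = {year: fmean(values) for year, values in by_year.items()}

print(daily_avg)
print(yearly_avg)
```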
[Figure: plot with x-axis AQI_avg]
array([[ 1. , 1.33630621],
[ 1. , 0.80178373],
[ 1. , 0.26726124],
[ 1. , -0.26726124],
[ 1. , -0.80178373],
[ 1. , -1.33630621]])
Gradient Descent: 161.82, -20.54
| Year | AQI_avg | Actual | Predicted | |
|---|---|---|---|---|
| 5 | 2020 | 114.927696 | 114.927696 | 134.372270 |
| 4 | 2019 | 156.116369 | 156.116369 | 145.351362 |
| 3 | 2018 | 176.373409 | 176.373409 | 156.330454 |
| 2 | 2017 | 164.348404 | 164.348404 | 167.309546 |
| 1 | 2016 | 178.330358 | 178.330358 | 178.288638 |
| 0 | 2015 | 180.848065 | 180.848065 | 189.267730 |
12.749887150652524
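The fit above can be reproduced from the six yearly averages in the table. The sketch below is a reconstruction under stated assumptions, not the notebook's code: the year is standardised with the sample standard deviation (which matches the second column of the design matrix), the model is fitted by batch gradient descent, and the final number is taken to be the root mean squared error (RMSE), since that is what the printed value matches. The learning rate and iteration count are arbitrary choices.

```python
import math
from statistics import fmean, stdev

# Yearly national AQI averages as reported in the table above.
years = [2015, 2016, 2017, 2018, 2019, 2020]
aqi = [180.848065, 178.330358, 164.348404, 176.373409, 156.116369, 114.927696]

# Standardise the year with the sample std (n - 1); this gives the
# +/-1.336, +/-0.802, +/-0.267 values seen in the design matrix.
mu, sigma = fmean(years), stdev(years)
x = [(yr - mu) / sigma for yr in years]


def gradient_descent(x, y, lr=0.1, n_iter=2000):
    """Fit y ~ b0 + b1 * x by batch gradient descent on (1 / 2n) * sum of squared residuals."""
    b0, b1 = 0.0, 0.0
    n = len(x)
    for _ in range(n_iter):
        residuals = [b0 + b1 * xi - yi for xi, yi in zip(x, y)]
        grad_b0 = sum(residuals) / n
        grad_b1 = sum(r * xi for r, xi in zip(residuals, x)) / n
        b0 -= lr * grad_b0
        b1 -= lr * grad_b1
    return b0, b1


b0, b1 = gradient_descent(x, aqi)
predicted = [b0 + b1 * xi for xi in x]
rmse = math.sqrt(fmean([(p - a) ** 2 for p, a in zip(predicted, aqi)]))

print(f"Gradient Descent: {b0:.2f}, {b1:.2f}")
print(f"RMSE: {rmse:.2f}")
```

Because the standardised year has zero mean, the intercept converges to the mean of the yearly averages, and the slope is the change in AQI per one standard deviation of the year (about 1.87 years).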
| Date | Month | Year | AQI_avg | |
|---|---|---|---|---|
| 0 | 2015-01-01 | 1 | 2015 | 177.000000 |
| 1 | 2015-01-02 | 1 | 2015 | 174.000000 |
| 2 | 2015-01-03 | 1 | 2015 | 122.166667 |
| 3 | 2015-01-04 | 1 | 2015 | 146.714286 |
| 4 | 2015-01-05 | 1 | 2015 | 147.571429 |
| ... | ... | ... | ... | ... |
| 1822 | 2019-12-28 | 12 | 2019 | 183.434783 |
| 1823 | 2019-12-29 | 12 | 2019 | 204.304348 |
| 1824 | 2019-12-30 | 12 | 2019 | 221.565217 |
| 1825 | 2019-12-31 | 12 | 2019 | 204.956522 |
| 1826 | 2020-01-01 | 1 | 2020 | 211.217391 |
1827 rows × 4 columns
[Figure: plot with x-axis AQI_avg]
array([[ 1. , 1.33630621],
[ 1. , 0.80178373],
[ 1. , 0.26726124],
[ 1. , -0.26726124],
[ 1. , -0.80178373],
[ 1. , -1.33630621]])
Gradient Descent: 177.87, 5.20
| Year | AQI_avg | Actual | Predicted | |
|---|---|---|---|---|
| 5 | 2020 | 114.927696 | 211.217391 | 184.818792 |
| 4 | 2019 | 156.116369 | 156.116369 | 182.039275 |
| 3 | 2018 | 176.373409 | 176.373409 | 179.259758 |
| 2 | 2017 | 164.348404 | 164.348404 | 176.480242 |
| 1 | 2016 | 178.330358 | 178.330358 | 173.700725 |
| 0 | 2015 | 180.848065 | 180.848065 | 170.921208 |
16.55481596210976
| Date | Month | Year | AQI_avg | |
|---|---|---|---|---|
| 1826 | 2020-01-01 | 1 | 2020 | 211.217391 |
| 1827 | 2020-01-02 | 1 | 2020 | 187.260870 |
| 1828 | 2020-01-03 | 1 | 2020 | 163.739130 |
| 1829 | 2020-01-04 | 1 | 2020 | 146.956522 |
| 1830 | 2020-01-05 | 1 | 2020 | 160.521739 |
| ... | ... | ... | ... | ... |
| 2004 | 2020-06-27 | 6 | 2020 | 74.346154 |
| 2005 | 2020-06-28 | 6 | 2020 | 79.038462 |
| 2006 | 2020-06-29 | 6 | 2020 | 78.000000 |
| 2007 | 2020-06-30 | 6 | 2020 | 72.230769 |
| 2008 | 2020-07-01 | 7 | 2020 | 89.615385 |
183 rows × 4 columns
<class 'pandas.core.frame.DataFrame'>
Int64Index: 183 entries, 1826 to 2008
Data columns (total 4 columns):
| # | Column | Non-Null Count | Dtype |
|---|---|---|---|
| 0 | Date | 183 non-null | datetime64[ns] |
| 1 | Month | 183 non-null | int64 |
| 2 | Year | 183 non-null | int64 |
| 3 | AQI_avg | 183 non-null | float64 |
dtypes: datetime64[ns](1), float64(1), int64(2)
memory usage: 7.1 KB
[Figure: plot with x-axis AQI_avg]
array([[ 1. , 1.38873015],
[ 1. , 0.9258201 ],
[ 1. , 0.46291005],
[ 1. , 0. ],
[ 1. , -0.46291005],
[ 1. , -0.9258201 ],
[ 1. , -1.38873015]])
Gradient Descent: 111.53, -31.99
| Year | AQI_avg | Actual | Predicted | |
|---|---|---|---|---|
| 0 | 2015.0 | 180.848065 | 167.450940 | 155.955477 |
| 1 | 2016.0 | 178.330358 | 157.586207 | 141.146985 |
| 2 | 2017.0 | 164.348404 | 110.869132 | 126.338492 |
| 3 | 2018.0 | 176.373409 | 88.646154 | 111.530000 |
| 4 | 2019.0 | 156.116369 | 88.243176 | 96.721508 |
| 5 | 2020.0 | 114.927696 | 78.310256 | 81.913015 |
| 6 | NaN | NaN | 89.615385 | 67.104523 |
15.84282666114951
| City | Date | PM2.5 | PM10 | NO | NO2 | NOx | NH3 | CO | SO2 | O3 | Benzene | Toluene | Xylene | AQI | Air_quality | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Ahmedabad | 2015-01-01 | 48.485 | 95.71 | 0.92 | 18.22 | 17.15 | 15.80 | 0.92 | 27.64 | 133.36 | 1.07 | 0.02 | 0.96 | 118.0 | Moderate |
| 1 | Ahmedabad | 2015-01-02 | 48.485 | 95.71 | 0.97 | 15.69 | 16.46 | 15.80 | 0.97 | 24.55 | 34.06 | 3.68 | 5.50 | 3.77 | 118.0 | Moderate |
| 2 | Ahmedabad | 2015-01-03 | 48.485 | 95.71 | 17.40 | 19.30 | 29.70 | 15.80 | 17.40 | 29.07 | 30.70 | 6.80 | 16.40 | 2.25 | 118.0 | Moderate |
| 3 | Ahmedabad | 2015-01-04 | 48.485 | 95.71 | 1.70 | 18.48 | 17.97 | 15.80 | 1.70 | 18.59 | 36.08 | 4.43 | 10.14 | 1.00 | 118.0 | Moderate |
| 4 | Ahmedabad | 2015-01-05 | 48.485 | 95.71 | 22.10 | 21.42 | 37.76 | 15.80 | 22.10 | 39.33 | 39.31 | 7.01 | 18.89 | 2.78 | 118.0 | Moderate |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 29526 | Visakhapatnam | 2020-06-27 | 15.020 | 50.94 | 7.68 | 25.06 | 19.54 | 12.47 | 0.47 | 8.55 | 23.30 | 2.24 | 12.07 | 0.73 | 41.0 | Good |
| 29527 | Visakhapatnam | 2020-06-28 | 24.380 | 74.09 | 3.42 | 26.06 | 16.53 | 11.99 | 0.52 | 12.72 | 30.14 | 0.74 | 2.21 | 0.38 | 70.0 | Satisfactory |
| 29528 | Visakhapatnam | 2020-06-29 | 22.910 | 65.73 | 3.45 | 29.53 | 18.33 | 10.71 | 0.48 | 8.42 | 30.96 | 0.01 | 0.01 | 0.96 | 68.0 | Satisfactory |
| 29529 | Visakhapatnam | 2020-06-30 | 16.640 | 49.97 | 4.05 | 29.26 | 18.80 | 10.03 | 0.52 | 9.84 | 28.30 | 1.07 | 2.96 | 0.96 | 54.0 | Satisfactory |
| 29530 | Visakhapatnam | 2020-07-01 | 15.000 | 66.00 | 0.40 | 26.85 | 14.05 | 5.20 | 0.59 | 2.10 | 17.05 | 1.07 | 2.96 | 0.96 | 50.0 | Good |
29531 rows × 16 columns
array(['Ahmedabad', 'Aizawl', 'Amaravati', 'Amritsar', 'Bengaluru',
'Bhopal', 'Brajrajnagar', 'Chandigarh', 'Chennai', 'Coimbatore',
'Delhi', 'Ernakulam', 'Gurugram', 'Guwahati', 'Hyderabad',
'Jaipur', 'Jorapokhar', 'Kochi', 'Kolkata', 'Lucknow', 'Mumbai',
'Patna', 'Shillong', 'Talcher', 'Thiruvananthapuram',
'Visakhapatnam'], dtype=object)
| City | Number of records |
|---|---|
| Ahmedabad | 2009 |
| Delhi | 2009 |
| Mumbai | 2009 |
| Bengaluru | 2009 |
| Lucknow | 2009 |
| Chennai | 2009 |
| Hyderabad | 2006 |
| Patna | 1858 |
| Gurugram | 1679 |
| Visakhapatnam | 1462 |
| Amritsar | 1221 |
| Jorapokhar | 1169 |
| Jaipur | 1114 |
| Thiruvananthapuram | 1112 |
| Amaravati | 951 |
| Brajrajnagar | 938 |
| Talcher | 925 |
| Kolkata | 814 |
| Guwahati | 502 |
| Coimbatore | 386 |
| Shillong | 310 |
| Chandigarh | 304 |
| Bhopal | 289 |
| Ernakulam | 162 |
| Kochi | 162 |
| Aizawl | 113 |
The AQI of India varies sporadically between local regions but, as we saw, shows a recurring seasonal trend around the monsoon. For this reason we chose Prophet, which is designed to model seasonality in time-series analysis.
| City | Date | PM2.5 | PM10 | NO | NO2 | NOx | NH3 | CO | SO2 | O3 | Benzene | Toluene | Xylene | AQI | Air_quality | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 10229 | Delhi | 2015-01-01 | 313.22 | 607.98 | 69.16 | 36.39 | 110.59 | 33.85 | 15.20 | 9.25 | 41.68 | 14.36 | 24.86 | 9.84 | 472.0 | Severe |
| 10230 | Delhi | 2015-01-02 | 186.18 | 269.55 | 62.09 | 32.87 | 88.14 | 31.83 | 9.54 | 6.65 | 29.97 | 10.55 | 20.09 | 4.29 | 454.0 | Severe |
| 10231 | Delhi | 2015-01-03 | 87.18 | 131.90 | 25.73 | 30.31 | 47.95 | 69.55 | 10.61 | 2.65 | 19.71 | 3.91 | 10.23 | 1.99 | 143.0 | Moderate |
| 10232 | Delhi | 2015-01-04 | 151.84 | 241.84 | 25.01 | 36.91 | 48.62 | 130.36 | 11.54 | 4.63 | 25.36 | 4.26 | 9.71 | 3.34 | 319.0 | Very Poor |
| 10233 | Delhi | 2015-01-05 | 146.60 | 219.13 | 14.01 | 34.92 | 38.25 | 122.88 | 9.20 | 3.33 | 23.20 | 2.80 | 6.21 | 2.96 | 325.0 | Very Poor |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 12233 | Delhi | 2020-06-27 | 39.80 | 155.94 | 10.88 | 21.46 | 22.47 | 31.43 | 0.87 | 10.38 | 18.88 | 1.69 | 19.99 | 0.43 | 112.0 | Moderate |
| 12234 | Delhi | 2020-06-28 | 59.52 | 308.65 | 12.67 | 21.60 | 23.86 | 29.27 | 0.94 | 10.70 | 18.05 | 1.71 | 25.13 | 1.74 | 196.0 | Moderate |
| 12235 | Delhi | 2020-06-29 | 44.86 | 184.12 | 10.50 | 21.57 | 21.94 | 27.97 | 0.88 | 11.58 | 26.61 | 2.13 | 23.80 | 1.13 | 233.0 | Poor |
| 12236 | Delhi | 2020-06-30 | 39.80 | 91.98 | 5.99 | 17.96 | 15.44 | 28.48 | 0.84 | 10.51 | 37.29 | 1.57 | 16.37 | 0.49 | 114.0 | Moderate |
| 12237 | Delhi | 2020-07-01 | 54.01 | 128.66 | 6.33 | 21.05 | 16.81 | 29.06 | 0.97 | 11.15 | 29.73 | 2.03 | 23.57 | 0.65 | 101.0 | Moderate |
2009 rows × 16 columns
| Date | AQI | |
|---|---|---|
| 0 | 2015-01-01 | 472.0 |
| 1 | 2015-01-02 | 454.0 |
| 2 | 2015-01-03 | 143.0 |
| 3 | 2015-01-04 | 319.0 |
| 4 | 2015-01-05 | 325.0 |
| ... | ... | ... |
| 2004 | 2020-06-27 | 112.0 |
| 2005 | 2020-06-28 | 196.0 |
| 2006 | 2020-06-29 | 233.0 |
| 2007 | 2020-06-30 | 114.0 |
| 2008 | 2020-07-01 | 101.0 |
2009 rows × 2 columns
| ds | |
|---|---|
| 2369 | 2021-06-27 |
| 2370 | 2021-06-28 |
| 2371 | 2021-06-29 |
| 2372 | 2021-06-30 |
| 2373 | 2021-07-01 |
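The future frame above ends at index 2373 on 2021-07-01, which is what extending the 2009 daily Delhi rows by 365 days would give (for example via Prophet's `make_future_dataframe(periods=365)`; the actual call is not shown in the text). The check below uses only the standard library, and the 365-day horizon is an assumption inferred from those numbers.

```python
from datetime import date, timedelta

n_history = 2009                  # daily Delhi rows, 2015-01-01 to 2020-07-01
last_observed = date(2020, 7, 1)  # final date in the Delhi table
horizon_days = 365                # assumed forecast horizon (not stated in the text)

# One future date per day after the last observation.
future_dates = [last_observed + timedelta(days=k) for k in range(1, horizon_days + 1)]

# Zero-based index of the final row once history and future are stacked.
last_index = n_history + horizon_days - 1

print(last_index, future_dates[-1])
```

The same arithmetic applies to Patna, whose 1858 rows plus 365 future days end at index 2222.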
| ds | yhat | yhat_lower | yhat_upper | |
|---|---|---|---|---|
| 2369 | 2021-06-27 | 15.151967 | -72.179657 | 107.341479 |
| 2370 | 2021-06-28 | 8.304171 | -81.092528 | 98.563748 |
| 2371 | 2021-06-29 | 7.531971 | -81.042863 | 95.885354 |
| 2372 | 2021-06-30 | 8.359923 | -80.456249 | 92.903216 |
| 2373 | 2021-07-01 | 8.126453 | -76.932192 | 108.591637 |
Cross Validation accuracy: 67.97047760703944
With this model we get a cross-validation accuracy of 67.97 for Delhi, which we consider good enough to use it for predicting AQI trends in Delhi over the coming years.
Showing the AQI trend in Delhi for the coming years, along with its yearly, monthly, and weekly behaviour.
| City | Date | PM2.5 | PM10 | NO | NO2 | NOx | NH3 | CO | SO2 | O3 | Benzene | Toluene | Xylene | AQI | Air_quality | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 23864 | Patna | 2015-06-01 | 48.485 | 95.71 | 14.41 | 25.06 | 39.32 | 15.80 | 1.56 | 1.80 | 8.89 | 1.07 | 0.29 | 0.96 | 118.0 | Moderate |
| 23865 | Patna | 2015-06-02 | 48.485 | 95.71 | 25.00 | 22.48 | 47.50 | 15.80 | 2.35 | 9.69 | 9.90 | 0.08 | 0.83 | 0.09 | 118.0 | Moderate |
| 23866 | Patna | 2015-06-03 | 48.485 | 95.71 | 14.29 | 17.16 | 29.81 | 15.80 | 1.69 | 20.61 | 12.63 | 1.07 | 0.33 | 0.96 | 118.0 | Moderate |
| 23867 | Patna | 2015-06-04 | 48.485 | 95.71 | 13.03 | 15.62 | 28.63 | 15.80 | 1.20 | 4.35 | 9.77 | 0.01 | 0.28 | 0.96 | 118.0 | Moderate |
| 23868 | Patna | 2015-06-05 | 48.485 | 95.71 | 10.40 | 10.36 | 20.14 | 15.80 | 1.29 | 7.22 | 11.90 | 1.07 | 0.15 | 0.96 | 118.0 | Moderate |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 25717 | Patna | 2020-06-27 | 17.710 | 63.73 | 9.47 | 23.01 | 22.28 | 1.91 | 0.87 | 3.63 | 23.39 | 1.09 | 3.07 | 0.97 | 65.0 | Satisfactory |
| 25718 | Patna | 2020-06-28 | 19.270 | 57.42 | 30.19 | 18.13 | 36.76 | 2.05 | 0.72 | 3.92 | 17.37 | 1.18 | 2.90 | 1.24 | 82.0 | Satisfactory |
| 25719 | Patna | 2020-06-29 | 17.240 | 42.83 | 42.40 | 20.51 | 47.69 | 2.26 | 0.88 | 3.60 | 17.50 | 1.51 | 4.91 | 1.74 | 88.0 | Satisfactory |
| 25720 | Patna | 2020-06-30 | 29.760 | 60.68 | 42.12 | 27.50 | 52.04 | 1.59 | 0.83 | 3.91 | 21.70 | 1.58 | 8.59 | 2.02 | 93.0 | Satisfactory |
| 25721 | Patna | 2020-07-01 | 35.420 | 57.82 | 44.50 | 31.15 | 57.72 | 1.14 | 0.82 | 3.99 | 25.76 | 1.73 | 5.50 | 2.14 | 98.0 | Satisfactory |
1858 rows × 16 columns
| ds | |
|---|---|
| 2218 | 2021-06-27 |
| 2219 | 2021-06-28 |
| 2220 | 2021-06-29 |
| 2221 | 2021-06-30 |
| 2222 | 2021-07-01 |
| ds | yhat | yhat_lower | yhat_upper | |
|---|---|---|---|---|
| 2218 | 2021-06-27 | 11.754445 | -62.226604 | 82.207675 |
| 2219 | 2021-06-28 | 8.512360 | -67.814273 | 82.807106 |
| 2220 | 2021-06-29 | 13.255852 | -59.275857 | 86.541985 |
| 2221 | 2021-06-30 | 13.006771 | -59.560854 | 85.133394 |
| 2222 | 2021-07-01 | 9.804651 | -63.738632 | 83.167927 |
Cross Validation accuracy: 63.38596674035115
With this model we get a cross-validation accuracy of 63.386 for Patna, which we consider good enough to use it for predicting AQI trends in Patna over the coming years.
Showing the AQI trend in Patna for the coming years, along with its yearly, monthly, and weekly behaviour.
| City | Date | PM2.5 | PM10 | NO | NO2 | NOx | NH3 | CO | SO2 | O3 | Benzene | Toluene | Xylene | AQI | Air_quality | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 4294 | Bengaluru | 2015-01-01 | 48.485 | 95.71 | 3.26 | 17.33 | 10.88 | 20.36 | 0.33 | 3.54 | 10.73 | 0.56 | 4.64 | 0.96 | 118.0 | Moderate |
| 4295 | Bengaluru | 2015-01-02 | 48.485 | 95.71 | 6.05 | 19.73 | 14.14 | 23.74 | 1.35 | 3.97 | 22.77 | 0.65 | 5.31 | 0.96 | 118.0 | Moderate |
| 4296 | Bengaluru | 2015-01-03 | 48.485 | 95.71 | 11.91 | 19.88 | 20.72 | 4.32 | 17.40 | 13.61 | 12.03 | 0.53 | 19.25 | 0.96 | 118.0 | Moderate |
| 4297 | Bengaluru | 2015-01-04 | 48.485 | 95.71 | 7.45 | 21.61 | 16.88 | 0.87 | 5.05 | 6.52 | 17.70 | 0.55 | 7.47 | 0.96 | 118.0 | Moderate |
| 4298 | Bengaluru | 2015-01-05 | 48.485 | 95.71 | 9.52 | 22.17 | 21.76 | 31.38 | 1.83 | 4.71 | 12.72 | 0.40 | 4.36 | 0.96 | 118.0 | Moderate |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 6298 | Bengaluru | 2020-06-27 | 16.600 | 29.48 | 3.06 | 13.68 | 13.07 | 6.88 | 0.67 | 7.29 | 15.69 | 0.21 | 1.18 | 0.96 | 51.0 | Satisfactory |
| 6299 | Bengaluru | 2020-06-28 | 20.440 | 26.34 | 2.69 | 10.33 | 10.58 | 6.58 | 0.66 | 6.60 | 17.59 | 0.12 | 0.94 | 0.96 | 61.0 | Satisfactory |
| 6300 | Bengaluru | 2020-06-29 | 28.680 | 29.27 | 3.62 | 12.12 | 12.94 | 6.80 | 0.56 | 6.33 | 16.99 | 0.17 | 1.17 | 0.96 | 65.0 | Satisfactory |
| 6301 | Bengaluru | 2020-06-30 | 14.470 | 24.26 | 4.61 | 12.69 | 15.00 | 6.82 | 0.56 | 6.45 | 16.08 | 0.18 | 0.86 | 0.96 | 63.0 | Satisfactory |
| 6302 | Bengaluru | 2020-07-01 | 17.500 | 30.48 | 3.95 | 13.25 | 14.83 | 7.42 | 0.54 | 6.66 | 15.40 | 0.27 | 0.65 | 0.96 | 43.0 | Good |
2009 rows × 16 columns
| ds | |
|---|---|
| 2369 | 2021-06-27 |
| 2370 | 2021-06-28 |
| 2371 | 2021-06-29 |
| 2372 | 2021-06-30 |
| 2373 | 2021-07-01 |
| ds | yhat | yhat_lower | yhat_upper | |
|---|---|---|---|---|
| 2369 | 2021-06-27 | 51.442944 | 8.732539 | 93.997069 |
| 2370 | 2021-06-28 | 51.458122 | 8.943406 | 94.890603 |
| 2371 | 2021-06-29 | 54.972129 | 9.719688 | 98.774818 |
| 2372 | 2021-06-30 | 54.693472 | 14.603014 | 99.555499 |
| 2373 | 2021-07-01 | 53.862229 | 11.849875 | 96.461272 |
Cross Validation accuracy: 63.49849589422245
With this model we get a cross-validation accuracy of 63.498 for Bengaluru, which we consider good enough to use it for predicting AQI trends in Bengaluru over the coming years.
Showing the AQI trend in Bengaluru for the coming years, along with its yearly, monthly, and weekly behaviour.
| City | Date | PM2.5 | PM10 | NO | NO2 | NOx | NH3 | CO | SO2 | O3 | Benzene | Toluene | Xylene | AQI | Air_quality | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 7834 | Chennai | 2015-01-01 | 48.485 | 95.71 | 16.30 | 15.39 | 22.68 | 4.59 | 1.17 | 9.20 | 11.35 | 0.17 | 2.96 | 0.96 | 118.0 | Moderate |
| 7835 | Chennai | 2015-01-02 | 48.485 | 95.71 | 16.49 | 13.42 | 23.09 | 7.83 | 1.23 | 8.61 | 9.16 | 0.13 | 2.96 | 0.96 | 118.0 | Moderate |
| 7836 | Chennai | 2015-01-03 | 48.485 | 95.71 | 9.72 | 19.56 | 9.99 | 4.63 | 0.77 | 48.23 | 13.45 | 0.03 | 2.96 | 0.96 | 118.0 | Moderate |
| 7837 | Chennai | 2015-01-04 | 48.485 | 95.71 | 9.60 | 16.20 | 11.71 | 5.23 | 1.00 | 27.96 | 10.33 | 1.07 | 2.96 | 0.96 | 118.0 | Moderate |
| 7838 | Chennai | 2015-01-05 | 48.485 | 95.71 | 9.16 | 16.30 | 12.94 | 5.50 | 0.90 | 16.60 | 9.36 | 0.57 | 2.96 | 0.96 | 118.0 | Moderate |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 9838 | Chennai | 2020-06-27 | 26.420 | 39.30 | 7.25 | 12.96 | 19.59 | 33.20 | 1.10 | 7.29 | 68.51 | 0.10 | 0.07 | 0.96 | 95.0 | Satisfactory |
| 9839 | Chennai | 2020-06-28 | 25.930 | 45.54 | 7.81 | 10.00 | 16.39 | 35.98 | 0.76 | 6.48 | 77.45 | 0.09 | 2.96 | 0.96 | 98.0 | Satisfactory |
| 9840 | Chennai | 2020-06-29 | 21.300 | 22.21 | 7.65 | 9.69 | 16.74 | 34.07 | 0.96 | 6.62 | 62.57 | 0.09 | 0.01 | 0.96 | 104.0 | Moderate |
| 9841 | Chennai | 2020-06-30 | 24.140 | 30.66 | 8.42 | 12.38 | 20.29 | 34.17 | 1.05 | 7.50 | 68.75 | 0.17 | 0.16 | 0.96 | 110.0 | Moderate |
| 9842 | Chennai | 2020-07-01 | 15.950 | 4.85 | 6.22 | 10.72 | 16.44 | 33.52 | 1.02 | 9.23 | 48.37 | 0.09 | 2.96 | 0.96 | 92.0 | Satisfactory |
2009 rows × 16 columns
| ds | |
|---|---|
| 2369 | 2021-06-27 |
| 2370 | 2021-06-28 |
| 2371 | 2021-06-29 |
| 2372 | 2021-06-30 |
| 2373 | 2021-07-01 |
| ds | yhat | yhat_lower | yhat_upper | |
|---|---|---|---|---|
| 2369 | 2021-06-27 | 65.917531 | 4.447421 | 124.194548 |
| 2370 | 2021-06-28 | 63.813983 | 5.102808 | 127.140649 |
| 2371 | 2021-06-29 | 69.038115 | 15.507342 | 127.347812 |
| 2372 | 2021-06-30 | 73.645892 | 16.467333 | 135.093176 |
| 2373 | 2021-07-01 | 73.117267 | 14.157471 | 128.271758 |
Cross Validation accuracy: 61.76724564183594
With this model we get a cross-validation accuracy of 61.767 for Chennai, which we consider good enough to use it for predicting AQI trends in Chennai over the coming years.
Showing the AQI trend in Chennai for the coming years, along with its yearly, monthly, and weekly behaviour.